Yiyang Feng
  • Learning RLHF (PPO) with codes (Huggingface TRL)

    A technical essay on Reinforcement Learning from Human Feedback (RLHF) and Proximal Policy Optimization (PPO), with code from Hugging Face TRL.

    10 min read   ·   September 16, 2023

    2023   ·   NLP   LLM     ·   TechEssays  

  • Reading Notes of How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources

    Reading notes on Yao's notes, "How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources".

    6 min read   ·   February 19, 2023

    2023   ·   NLP   LLM     ·   ReadingNotes  

  • Huggingface parallel training for solving the CUDA out of memory issue

    Documents a workable solution to the annoying CUDA out-of-memory (OOM) error.

    3 min read   ·   February 12, 2023

    2023   ·   NLP   CUDA     ·   TechEssays  

  • Could you give me a hint? Generating inference graphs for defeasible reasoning

    Reading notes on a paper about defeasible reasoning.

    3 min read   ·   January 24, 2023

    2023   ·   NLP   CausalReasoning     ·   ReadingNotes  

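© Copyright 2024 Yiyang Feng. With a bamboo staff and straw sandals, lighter on my feet than a horse, who is afraid? In a straw cloak through mist and rain, I go on through life.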