Posts in the 'Studies' category

List: Studies (10)

코딩하는 임초얀

🟦 DeepSeek R1: Breakthrough in AI Reasoning through Simple RL

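Takeaway with Perplexity
1. It has been demonstrated that a strong AI reasoning model can be built without complex search algorithms.
2. RL alone can lead an AI to develop reasoning ability on its own.
3. It opens up the possibility that individuals or small teams can develop reasoning models without large-scale computing resources.
4. It shows that effective AI training is possible with only a simple verification system.
Original link: https://www.linkedin.com/posts/andrew-iain-jardine_deepseek-ais-r1-research-report-reveals-activity-7287457792418820097-z0Xb/
LinkedIn, Andrew Jardine's page: DeepSeek AI's R1 research report re..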
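As a rough illustration of takeaway 4, a "simple verification system" could be a rule-based reward that compares a final answer with a known reference. This is only a sketch: the function name, the `<answer>` tag convention, and the 0/1 reward values are assumptions made for illustration and are not taken from the post.

```python
import re

# Hypothetical convention: the model wraps its final answer in <answer>...</answer>.
ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)

def rule_based_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the tagged answer matches the reference, else 0.0."""
    match = ANSWER_PATTERN.search(completion)
    if match is None:
        return 0.0  # no parsable answer, so no reward
    return 1.0 if match.group(1).strip() == reference.strip() else 0.0

good = rule_based_reward("Let me think. 2 + 2 = 4. <answer>4</answer>", "4")
bad = rule_based_reward("I guess <answer>5</answer>", "4")
missing = rule_based_reward("The answer is 4", "4")
print(good, bad, missing)
```

A check like this needs no learned reward model, which is one way such a setup could stay cheap.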

Studies/LinkedIn 2025. 1. 23. 16:03

Paper Review - "Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble" (IJCAI 2022)

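Abstract
Why RL is hard to use in areas such as stock investing: noisy observations and an environment that keeps changing. Handling the first requires high sample efficiency, and handling the second requires good generalization. In SL (supervised learning), ensembles improve both accuracy and generalization, which suggests that ensembles can be applied to RL as well. => Enter EPPO, which trains ensemble policies end-to-end!
In particular, EPPO
1. organically combines the subpolicies with the ensemble policy and optimizes both at the same time.
2. uses diversity enhancement regularization in the policy space to [unseen s..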
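To make the idea of an ensemble policy concrete, here is a minimal sketch that combines several sub-policies by averaging their action probabilities for a single state. The averaging rule, the function name, and the numbers are assumptions for illustration; the excerpt is cut off before it describes how EPPO actually aggregates or regularizes its sub-policies.

```python
import random

def ensemble_policy(sub_policy_probs):
    """Average K categorical action distributions into one distribution."""
    k = len(sub_policy_probs)
    n_actions = len(sub_policy_probs[0])
    return [sum(p[a] for p in sub_policy_probs) / k for a in range(n_actions)]

# Three hypothetical sub-policies over three actions, for one state.
sub_policies = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
]
probs = ensemble_policy(sub_policies)

# Sample an action from the averaged distribution.
action = random.choices(range(len(probs)), weights=probs, k=1)[0]
print(probs, action)
```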

Studies/Paper Review 2024. 4. 22. 15:13

Bellman Equation

A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming. It writes the "value" of a decision problem at a certain point in time in terms of the payoff from some initial choices and the "value" of the remaining decision problem that results from those initial choices. This breaks a..
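The sentence about "payoff from some initial choices" plus "the value of the remaining decision problem" can be written out for one common case, a deterministic problem with discounting. The symbols are assumptions introduced here, not taken from the excerpt: $x$ is the state, $a$ an action from the feasible set $\Gamma(x)$, $F$ the immediate payoff, $T$ the transition function, and $\beta \in (0, 1)$ the discount factor.

```latex
V(x) = \max_{a \in \Gamma(x)} \bigl\{ F(x, a) + \beta \, V\bigl(T(x, a)\bigr) \bigr\}
```

The first term is the payoff from the initial choice, and the second is the discounted value of the problem that remains after that choice.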

Studies/Wiki Korean Translation 2023. 3. 3. 17:16

Radiance

In radiometry, radiance is the radiant flux emitted, reflected, transmitted or received by a given surface, per unit solid angle per unit projected area. Spectral radiance is the radiance of a surface per unit frequency or wavelength, depending on whether the spectrum is taken as a function of frequency or of wavelength. These are directional quantities.
Korean translation in the post: in radiometry, radiance (방사 휘도, 輝度) is, for a given surface, the emitted, reflected, transmitted ..
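The phrase "per unit solid angle per unit projected area" corresponds to the usual differential definition. The symbols are assumptions introduced here for illustration: $\Phi_e$ is the radiant flux, $\Omega$ the solid angle, $A$ the surface area, and $\theta$ the angle between the surface normal and the direction considered, so that $A \cos\theta$ is the projected area.

```latex
L_{e,\Omega} = \frac{\partial^2 \Phi_e}{\partial \Omega \, \partial (A \cos\theta)}
```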

Studies/Wiki Korean Translation 2022. 9. 28. 12:25

Intrinsic dimension

The intrinsic dimension for a data set can be thought of as the number of variables needed in a minimal representation of the data. Similarly, in signal processing of multidimensional signals, the intrinsic dimension of the signal describes how many variables are needed to generate a good approximation of the signal.
Korean translation in the post: the intrinsic dimension of a data set can be thought of as the number of variables needed in a minimal representation of the data. Likewise, in signal processing of multidimensional signals ..
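A small worked example may help, with symbols that are assumptions introduced here: take a two-variable signal $f$ that in fact depends on only one of its variables through some one-variable function $g$.

```latex
f(x_1, x_2) = g(x_1)
```

Although $f$ is written with two variables, one variable is enough to represent it, so its intrinsic dimension is one (assuming $g$ is not constant).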

Studies/Wiki Korean Translation 2022. 9. 23. 10:00

Story 04. List Comprehensions

Previous post >> Story 03. Deep Copy and Shallow Copy
How to create lists
A list comprehension can be used to create a list as follows.
>>> r1 = [1, 2, 3, 4, 5]
>>> r2 = [x * 2 for x in r1]
>>> r3 = [x + 10 for x in r1]
Adding a condition filter
>>> r1 = [1, 2, 3, 4, 5]
>>> r2 = [x * 2 for x in r1 if x % 2]
>>> r2
[2, 6, 10]
When a list comprehension contains a second for
>>> r1 = ['s', 't']
>>> r2 = ['1', '2', '3']
>>> r3 = [i+j for i in r1 for j in r2]
>>> r3
['s1', 's2', 's3', 't1'..
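The filtered and double-for comprehensions above can be read as ordinary loops. The sketch below writes both forms side by side; the variable names are chosen here for illustration and are not from the post.

```python
r1 = [1, 2, 3, 4, 5]

# Comprehension with a filter: keep odd x (x % 2 is truthy), then double it.
doubled_odds = [x * 2 for x in r1 if x % 2]

# Equivalent explicit loop.
loop_result = []
for x in r1:
    if x % 2:
        loop_result.append(x * 2)

# Two for clauses behave like nested loops, with the leftmost clause outermost.
pairs = [i + j for i in ['s', 't'] for j in ['1', '2', '3']]
nested = []
for i in ['s', 't']:
    for j in ['1', '2', '3']:
        nested.append(i + j)

print(doubled_odds)  # [2, 6, 10]
print(pairs)         # ['s1', 's2', 's3', 't1', 't2', 't3']
```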

Studies/윤성우의 열혈 파이썬 중급편 2022. 7. 30. 19:54

Story 03. Deep Copy and Shallow Copy

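Previous post >> Story 02. Mutable and Immutable Objects
Next post >> Story 04. List Comprehensions
Comparing and copying two objects
Comparing objects
v1 == v2: do the objects referenced by v1 and v2 have the same contents?
v1 is v2: do v1 and v2 reference the very same object?
The is operation returns True in a case like the following.
>>> r1 = [1, 2, 3]
>>> r2 = r1  # attach one more name, r2, to the list that r1 references
>>> r1 is r2
True
Consider the result of copying an object in the next example.
>>> r1 = ['str', ('tu', 'ple'), [1, 2]]
>>> r2 = list(r1)  # build a new list from the contents of r1
>>> r1 is r2 ..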
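The excerpt stops before showing the outcome, so here is a minimal sketch of how a shallow copy made with `list()` differs from a deep copy. Using `copy.deepcopy` and the variable names `shallow` and `deep` are assumptions made here for illustration; the post's own example may continue differently.

```python
import copy

r1 = ['str', ('tu', 'ple'), [1, 2]]
shallow = list(r1)          # new outer list, same inner objects
deep = copy.deepcopy(r1)    # new outer list, inner list copied as well

r1[2].append(3)             # mutate the inner list through r1

print(shallow is r1)        # False
print(shallow[2] is r1[2])  # True
print(shallow[2])           # [1, 2, 3]
print(deep[2])              # [1, 2]
```

The shallow copy sees the appended element because it shares the inner list with `r1`, while the deep copy does not.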

Studies/윤성우의 열혈 파이썬 중급편 2022. 7. 29. 23:10

Story 02. Mutable and Immutable Objects

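Previous post >> Story 01. Reference Counting and Garbage Collection
Next post >> Story 03. Deep Copy and Shallow Copy
immutable & mutable
immutable object: an object whose stored value cannot be modified. ex) tuples, strings
mutable object: an object whose stored value can be modified. ex) lists, dictionaries
The examples below look similar but behave differently.
>>> r = [1, 2]
>>> r += [3, 4]
>>> r
[1, 2, 3, 4]
>>> t = (1, 2)
>>> t += (3, 4)
>>> t
(1, 2, 3, 4)
This can be checked with the id() function, which returns the address of an object.
Help on built-in function id in module builtins:..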
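One way to see the difference is to keep a second reference to each object before applying `+=` and then compare identities. This is a small sketch; the names `r_before` and `t_before` are introduced here for illustration, and `is` is used in place of comparing `id()` values by hand.

```python
r = [1, 2]
r_before = r
r += [3, 4]           # list: the existing object is extended in place
print(r is r_before)  # True
print(r_before)       # [1, 2, 3, 4]

t = (1, 2)
t_before = t
t += (3, 4)           # tuple: a new tuple is built and t is rebound to it
print(t is t_before)  # False
print(t_before)       # (1, 2)
```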

Studies/윤성우의 열혈 파이썬 중급편 2022. 7. 29. 02:06
