Day 49. Huggingface - Diffusers & NLP Deep Learning - Text2Image & uv & LLM

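Day 49 retrospective.

My ADsP certificate was issued, and I was rejected for one job posting. It was the expected result since I had not prepared enough for this field, but it was still disappointing. Over the weekend I plan to study more for the Big Data Analysis Engineer exam and get plenty of sleep. The hackathon model results also keep coming out poorly, so I need to think more about that.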

1. Diffusers

1-1. Diffusers

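A library that provides pretrained diffusion models for generating images, audio, 3D molecular structures, and more.

Diffusers architecture

CLIP Text Encoder
- Tokenizes the user's text prompt and converts it into embedding vectors.
U-Net Model
- Takes the text embeddings and the noisy image (latents) as input and gradually removes the noise.
- Generates conditioned latents in latent space that reflect the text condition.
Scheduler
- Coordinates the U-Net Model so that it repeats the denoising process multiple times.
- At each step it adds (during training) or removes (during inference) the appropriate amount of noise.
VAE Decoder
- Takes the conditioned latents produced by the U-Net Model and decodes them into an actual image.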

DiffusionPipeline

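Model
- Takes a noisy sample as input and predicts the noise residual, that is, the difference from a less noisy image at the current timestep.
- Repeats this to remove the noise gradually.
Scheduler
- Based on the noise residual predicted by the model, manages the conversion from a noisier sample to a less noisy one.
- Updates the sample according to the schedule for each timestep and coordinates the whole denoising process.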
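To make the split between model and scheduler concrete, here is a minimal toy sketch in plain Python. It is not the Diffusers implementation: the linear schedule, the function names, and the "oracle" model (which cheats by knowing the clean sample, standing in for a trained network) are all assumptions made for illustration. The update itself is a deterministic DDIM-style step.

```python
import math
import random


def make_alphas_cumprod(num_steps):
    """Hypothetical linear schedule: the signal level falls from ~0.9 to ~0.1."""
    return [1.0 - (t + 1) / (num_steps + 1) for t in range(num_steps)]


def oracle_model(sample, t, alphas_cumprod, clean):
    """Stand-in for a trained network: returns the exact noise residual."""
    a = alphas_cumprod[t]
    return [(x - math.sqrt(a) * c) / math.sqrt(1.0 - a) for x, c in zip(sample, clean)]


def scheduler_step(noise_pred, t, sample, alphas_cumprod):
    """Deterministic DDIM-style update from timestep t to timestep t - 1."""
    a_t = alphas_cumprod[t]
    a_prev = alphas_cumprod[t - 1] if t > 0 else 1.0  # 1.0 means no noise left
    updated = []
    for x, eps in zip(sample, noise_pred):
        # Estimate the clean value, then re-noise it to the previous, lower level
        pred_clean = (x - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)
        updated.append(math.sqrt(a_prev) * pred_clean + math.sqrt(1.0 - a_prev) * eps)
    return updated


def denoise(clean, num_steps=10, seed=0):
    """Noise a clean sample, then run the model/scheduler loop to undo it."""
    rng = random.Random(seed)
    alphas = make_alphas_cumprod(num_steps)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    a_last = alphas[-1]
    sample = [math.sqrt(a_last) * c + math.sqrt(1.0 - a_last) * n
              for c, n in zip(clean, noise)]
    for t in reversed(range(num_steps)):
        noise_pred = oracle_model(sample, t, alphas, clean)        # model
        sample = scheduler_step(noise_pred, t, sample, alphas)     # scheduler
    return sample
```

Because the stand-in model predicts the noise perfectly, the loop walks the sample back to the clean values. A real network only approximates the noise, which is why the number of steps and the choice of scheduler affect image quality.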

2. Text2Image

2-1. FLUX

Hugging Face Token

import os

os.environ['HF_TOKEN'] = ''  # paste your Hugging Face access token here; do not commit it

Install diffusers

!pip install -U diffusers

Pretrained Model

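import torch
from diffusers import FluxPipeline

# Load the pretrained pipeline in bfloat16 to reduce memory use
pipe = FluxPipeline.from_pretrained('black-forest-labs/FLUX.1-dev', torch_dtype=torch.bfloat16)
# Keep submodels on the CPU and move each one to the GPU only while it runs
pipe.enable_model_cpu_offload()

prompt = ''  # text description of the image to generate

image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,            # how strongly the image follows the prompt
    num_inference_steps=50,        # number of denoising steps
    max_sequence_length=512,       # maximum prompt length in tokens
    generator=torch.Generator('cpu').manual_seed(0)  # fixed seed for reproducibility
).images[0]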

3. uv

3-1. uv

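uv supports not only installing and managing Python packages but also building and publishing them.
It is written in Rust, which makes it very fast and stable.
It consolidates the roles of existing tools such as pip, setuptools, build, and twine into one, providing an efficient Python packaging environment.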

3-2. Installing and using uv

Installation (Windows)

Install it through PowerShell, then restart PowerShell.

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Verification

uv python list

Virtual environment

uv venv .venv -p 3.12				# create a uv virtual environment
.\.venv\Scripts\activate			# activate the virtual environment
uv pip install -r .\requirements.txt		# install the dependencies

4. LLM

4-1. LLM

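LLM (Large Language Model)

An LLM uses deep learning algorithms and statistical modeling to perform natural language processing (NLP) tasks.
It is pretrained on large-scale language data, so it can understand and generate sentence structure, grammar, and meaning.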

4-2. ChatGPT

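ChatGPT training process

Collect demonstration data and run supervised learning
Collect comparison data and train a reward model (Reward Model Training)
Optimize the policy against the reward model with reinforcement learning based on PPO (Proximal Policy Optimization)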
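The second and third steps each revolve around one small formula, so a rough sketch may help. The function names below are my own, and this only shows the per-example arithmetic of the commonly described pairwise reward loss and the PPO clipped objective, not an actual training loop.

```python
import math


def reward_pairwise_loss(reward_chosen, reward_rejected):
    """-log(sigmoid(r_chosen - r_rejected)): small when the preferred answer scores higher."""
    diff = reward_chosen - reward_rejected
    return math.log1p(math.exp(-diff))


def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """PPO surrogate for one sample: the probability ratio is clipped to [1 - eps, 1 + eps]."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum keeps the policy from profiting from overly large updates
    return min(ratio * advantage, clipped_ratio * advantage)
```

The reward model is trained to minimize the first function over human comparison pairs, and the policy is then updated to maximize the second, where ratio is the new policy's probability of an answer divided by the old policy's.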

Limitations of ChatGPT

It can still produce errors.
- Potential risks such as hallucination have not been fully resolved.
The GPT-4 technical report is closed.
- The report does not disclose the core information needed to develop a hyperscale AI model.

5. OpenAI API

5-1. OpenAI API

Issuing an OpenAI API key

Creating the client

!pip install openai
from openai import OpenAI

client = OpenAI()

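Registering the OpenAI key

Writing the key directly in code is not safe.
- It is safer to set it in a system environment variable or register it in AWS.
- OpenAI() reads OPENAI_API_KEY from the environment, so the key has to be set before the client is created.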

import os

os.environ['OPENAI_API_KEY'] = ''
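An empty or missing key tends to surface later as a confusing authentication error, so it can help to fail early. Below is a small sketch of such a check. get_api_key is a hypothetical helper of my own, not part of the openai package, and the environ parameter exists only to make it easy to test.

```python
import os


def get_api_key(name='OPENAI_API_KEY', environ=None):
    """Return the key from the environment, or raise a clear error if it is missing."""
    environ = os.environ if environ is None else environ
    key = environ.get(name, '').strip()
    if not key:
        raise RuntimeError(f'{name} is not set; define it in the shell or in a .env file')
    return key
```

Calling get_api_key() once at startup makes the script stop with a readable message instead of failing on the first API request.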

Text Generation

completion = client.chat.completions.create(
    model='gpt-4o-mini',
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful assistant.'
        },
        {
            'role': 'user',
            'content': 'Write a story about a brave knight fighting a dragon.'
        }
    ]
)
completion
completion.choices[0].message.content

Image Generation (Vision)

response = client.images.generate(
    model='dall-e-3',
    prompt='a white siamese cat',
    size='1024x1024',
    quality='standard',
    n=1
)
image_url = response.data[0].url
image_url

Image Interpretation

response = client.chat.completions.create(
    model='gpt-4o-mini',
    messages=[
        {
            'role': 'user',
            'content': [
                {
                    'type': 'text',
                    'text': 'What\'s in this image?'
                },
                {
                    'type': 'image_url',
                    'image_url': {
                        'url': image_url
                    },
                },
            ],
        },
    ],
    max_tokens=300
)
response.choices[0].message.content
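The example above passes a hosted URL. For an image file on disk, the same message format also accepts a base64 data URL. The helpers below are a sketch under that assumption; to_data_url and build_image_message are names I made up for illustration.

```python
import base64


def to_data_url(image_bytes, mime_type='image/png'):
    """Encode raw image bytes as a data URL usable in the 'image_url' field."""
    encoded = base64.b64encode(image_bytes).decode('ascii')
    return f'data:{mime_type};base64,{encoded}'


def build_image_message(question, image_bytes, mime_type='image/png'):
    """Build one user message that pairs a text question with an inline image."""
    return {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': question},
            {'type': 'image_url', 'image_url': {'url': to_data_url(image_bytes, mime_type)}},
        ],
    }
```

Reading a file with open(path, 'rb').read() and passing the bytes to build_image_message gives a message that can go straight into the messages list.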

Structured Outputs

from pydantic import BaseModel

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model='gpt-4o-mini',
    messages=[
        {
            'role': 'system',
            'content': 'Extract the event information.'
        },
        {
            'role': 'user',
            'content': 'Alice and Bob are going to a science fair on Friday.'
        }
    ],
    response_format=CalendarEvent
)
completion
completion.choices[0].message.parsed

Reasoning

response = client.chat.completions.create(
    model='gpt-4o-mini',
    messages=[
        {
            'role': 'user',
            'content': 'Solve the equation: 5x - 3 = 12'
        }
    ]
)
response
response.choices[0].message.content

Chat

completion = client.chat.completions.create(
    model='gpt-4o-mini',
    messages=[
        {
            'role': 'system',
            'content': 'You are a Python programmer.'
        },
        {
            'role': 'user',
            'content': 'Please write a Python program that generates the Fibonacci sequence.'
        }
    ]
)
completion
completion.choices[0].message.content

Streaming

completion = client.chat.completions.create(
    model='gpt-4o-mini',
    messages=[
        {
            'role': 'system',
            'content': 'You are a Python programmer.'
        },
        {
            'role': 'user',
            'content': 'Please write a Python program that generates the Fibonacci sequence.'
        }
    ],
    stream=True
)

for chunk in completion:
    # delta.content can be None (for example on the final chunk), so fall back to ''
    print(chunk.choices[0].delta.content or '', end='', flush=True)
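Printing is enough for a demo, but an app usually also wants the full reply once streaming ends. The sketch below collects the pieces into one string. collect_stream and fake_chunk are hypothetical names, and fake_chunk only imitates the shape of the streamed chunks so the logic can be tried without calling the API.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text pieces of a streamed completion, skipping empty deltas."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks may carry no choices at all
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return ''.join(parts)


def fake_chunk(text):
    """Test double with the same attribute path as a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
```

With a real stream, collect_stream(completion) returns the whole answer, which can then be stored as the assistant message in a conversation history.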

Remembering Previous Messages (Conversation Continuity)

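def ask(question, message_history=None, model='gpt-3.5-turbo'):
    # Avoid a mutable default argument: a shared list would leak history across calls
    if message_history is None:
        message_history = []

    # Start a new conversation with the system prompt
    if len(message_history) == 0:
        message_history.append({
            'role': 'system',
            'content': 'You are a helpful assistant. You must answer in Korean.'
        })

    # Add the user's question
    message_history.append({
        'role': 'user',
        'content': question
    })

    # Send the whole history so the model can see the earlier turns
    completion = client.chat.completions.create(
        model=model,
        messages=message_history
    )

    # Store the assistant's answer for the next turn
    message_history.append({
        'role': 'assistant',
        'content': completion.choices[0].message.content
    })

    return message_history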

message_history = ask('What is the capital of South Korea?', message_history=[])
message_history[-1]
# {'role': 'assistant', 'content': '대한민국의 수도는 서울입니다.'}

message_history = ask('Please give the previous answer in English.', message_history=message_history)
message_history[-1]
# {'role': 'assistant', 'content': 'The capital of South Korea is Seoul.'}
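The model only "remembers" because every call resends the full list of messages. To see how that list grows without spending API credits, here is an offline sketch. FakeCompletions, fake_client, and ask_with are stand-ins I wrote for illustration; the fake client just echoes the last user message.

```python
from types import SimpleNamespace


class FakeCompletions:
    """Hypothetical stand-in for client.chat.completions; echoes the last user message."""

    def create(self, model, messages):
        reply = f"echo: {messages[-1]['content']}"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))


def ask_with(client, question, message_history=None, model='gpt-3.5-turbo'):
    """Same flow as ask, but the client is passed in so it can be swapped out."""
    history = list(message_history or [])
    if not history:
        history.append({
            'role': 'system',
            'content': 'You are a helpful assistant. You must answer in Korean.'
        })
    history.append({'role': 'user', 'content': question})
    completion = client.chat.completions.create(model=model, messages=history)
    history.append({'role': 'assistant', 'content': completion.choices[0].message.content})
    return history


history = ask_with(fake_client, 'first question')
history = ask_with(fake_client, 'second question', history)
roles = [m['role'] for m in history]
# roles is ['system', 'user', 'assistant', 'user', 'assistant']
```

Each turn adds two entries, so the request grows with the length of the conversation, and long chats eventually need the history to be trimmed or summarized.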

JSON Object Responses

import json

import pandas as pd

response = client.chat.completions.create(
    model='gpt-3.5-turbo-1106',
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful assistant designed to output JSON.'
        },
        {
            'role': 'user',
            'content': 'Write a four-option multiple-choice question on statistics. Give the answer as an index number. ' +
                       'Mark the difficulty as one of [high, medium, low].'
        }
    ],
    response_format={'type': 'json_object'},
    temperature=0.5,
    max_tokens=300,
    n=5
)
for res in response.choices:
    print(res.message.content)

json_result = [json.loads(res.message.content) for res in response.choices]

pd.DataFrame(json_result)
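With max_tokens=300, a reply can be cut off in the middle of an object, and then json.loads raises an error for that choice. A defensive variant of the parsing step might look like the sketch below. parse_json_contents is a hypothetical helper, not something from the openai package.

```python
import json


def parse_json_contents(contents):
    """Parse each reply as a JSON object; return (parsed objects, indexes that failed)."""
    parsed, failed = [], []
    for index, text in enumerate(contents):
        try:
            value = json.loads(text)
        except (json.JSONDecodeError, TypeError):
            failed.append(index)  # truncated, malformed, or missing content
            continue
        if isinstance(value, dict):
            parsed.append(value)
        else:
            failed.append(index)  # valid JSON, but not an object
    return parsed, failed
```

Calling it as parse_json_contents([res.message.content for res in response.choices]) gives a list that is safe to hand to pd.DataFrame, plus the positions of any replies worth retrying.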

6. LLM Project

6-1. Git & Github

Create a new project
- LLM_Tutorial
Initialize Repository
Create branches
- develop branch
- feature-openai branch - development happens here

6-2. Development

.gitignore

.venv
.env

.env

OPENAI_API_KEY=

requirements.txt

python-dotenv
streamlit
openai
jupyter

Virtual environment

uv venv .venv -p 3.12
.\.venv\Scripts\activate
uv pip install -r .\requirements.txt

chatbot/app.py

import streamlit as st
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables
load_dotenv()

# Create OpenAI client
client = OpenAI()

# User input
prompt = st.chat_input('Write something...')
if prompt:
    message = [
        {
            'role': 'system',
            'content': 'You are an AI Developer.'
        },
        {
            'role': 'user',
            'content': prompt
        }
    ]
    completion = client.chat.completions.create(
        model='gpt-4o-mini',
        messages=message
    )

    # Display response
    st.write(completion.choices[0].message.content)
