Time Series Forecasting

Basic Forecasting Methods

Preparing the practice data (beer production):

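import pandas as pd

beer = pd.read_excel('beer.xlsx', parse_dates=True, index_col='quarter')
# Build the index for the 20 future quarters to forecast.
# date_range starts in the quarter of the last observation, so generate 21 dates
# and drop the first one; otherwise the forecast would overlap the last observation.
# 'Q': quarter-end frequency (newer pandas versions prefer the alias 'QE')
future = pd.date_range(beer.index[-1], periods=21, freq='Q')[1:]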
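
Since beer.xlsx is not reproduced here, the construction of the future index can be checked on a made-up date instead. The sketch below is only an illustration: the helper name future_quarters and the example dates are hypothetical, and it passes pd.offsets.QuarterEnd() as the frequency rather than a string alias so that it does not depend on the pandas version.

```python
import pandas as pd

def future_quarters(last_obs, h):
    """Return the h quarter-end dates that follow the quarter containing last_obs.

    Hypothetical helper for illustration; not part of the original notes.
    """
    freq = pd.offsets.QuarterEnd()
    # The first generated date is the quarter end of last_obs's own quarter,
    # so generate h + 1 dates and drop that first one.
    return pd.date_range(last_obs, periods=h + 1, freq=freq)[1:]

# Last observation stamped at a quarter end
future_a = future_quarters(pd.Timestamp('2002-12-31'), 20)
# Last observation stamped at a quarter start (same quarter, Q4 2002)
future_b = future_quarters(pd.Timestamp('2002-10-01'), 20)

print(future_a[0], future_a[-1], len(future_a))
```

Both ways of stamping the last quarter lead to the same 20 future dates, and none of them coincides with the observed period.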

Average Method: forecast every future value as the mean of all past observations.

pred_avg = pd.DataFrame({
    'production': beer.production.mean()}, 
    index=future)
beer.production.plot()
pred_avg.production.plot()

Naïve Method: forecast every future value as the last observed value. (Optimal when the series follows a random walk.)

pred_naive = pd.DataFrame({
    'production': beer.production.iloc[-1]}, 
    index=future)
beer.production.plot()
pred_naive.production.plot()
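
The remark about random walks can be illustrated with a small simulation. This is a rough sketch on simulated data, not the beer series: the seed, the series length, and the use of one-step-ahead mean squared error as the comparison are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed, arbitrary choice
y = np.cumsum(rng.normal(size=500))   # simulated random walk

# One-step-ahead forecasts for t = 1, ..., 499
naive_pred = y[:-1]                                    # last observed value
mean_pred = np.cumsum(y)[:-1] / np.arange(1, len(y))   # mean of y[0], ..., y[t-1]

actual = y[1:]
mse_naive = np.mean((actual - naive_pred) ** 2)
mse_mean = np.mean((actual - mean_pred) ** 2)
print(f"naive MSE: {mse_naive:.2f}, average-method MSE: {mse_mean:.2f}")
```

On a random walk the naïve error is just the latest random step, while the average method keeps pulling the forecast back toward levels the series has already left, so its error is much larger.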

Seasonal Naïve Method: forecast each future value as the last observed value from the same season.

import numpy as np
# Repeat the values of the last year (4 quarters); np.tile repeats an array
last_season = beer.production.iloc[-4:].to_numpy()
pred_seasonal_naive = pd.DataFrame({
    'production': np.tile(last_season, 5)},  # 20 quarters = 5 years
    index=future)
beer.production.plot()
pred_seasonal_naive.production.plot()
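
The np.tile approach works when the horizon is a whole number of years. For an arbitrary horizon, one possible generalization is to index the last season cyclically. The function name seasonal_naive and its parameters m (season length) and h (horizon) are hypothetical names chosen for this sketch.

```python
import numpy as np

def seasonal_naive(y, m, h):
    """Seasonal naive forecast: repeat the last m observations cyclically for h steps."""
    y = np.asarray(y, dtype=float)
    if len(y) < m:
        raise ValueError("need at least one full season of observations")
    last_season = y[-m:]
    # Step k (0-based) reuses the value at position k mod m of the last season.
    return last_season[np.arange(h) % m]

y = [1, 2, 3, 4, 5, 6, 7, 8]               # toy data: two "years" of quarterly values
forecast = seasonal_naive(y, m=4, h=6)
print(forecast)                            # [5. 6. 7. 8. 5. 6.]
```

For h = 20 and m = 4 this gives the same values as tiling the last four observations five times.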

Drift Method: forecast as the last observed value plus the average historical change per period (the trend), multiplied by the number of steps ahead.

num = len(beer.production) - 1  # number of period-to-period changes
drift = (beer.production.iloc[-1] - beer.production.iloc[0]) / num  # average change per period over the whole sample
last = beer.production.iloc[-1]
pred_drift = pd.DataFrame({
    'production': last + np.arange(1, 21) * drift},  # forecasts for steps 1 to 20
    index=future)
beer.production.plot()
pred_drift.production.plot()
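
The drift term is the average of the period-to-period changes, which simplifies to (last value - first value) / (n - 1). A drift forecast therefore extends the straight line through the first and last observations. The sketch below checks this on a tiny made-up series; drift_forecast is a hypothetical helper name.

```python
import numpy as np

def drift_forecast(y, h):
    """Drift forecast: last value plus h times the average change per period."""
    y = np.asarray(y, dtype=float)
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return y[-1] + slope * np.arange(1, h + 1)

y = np.array([10.0, 12.0, 11.0, 16.0])   # toy data
forecast = drift_forecast(y, 3)
print(forecast)                          # [18. 20. 22.]

# The same slope is obtained as the mean of the first differences.
print(np.diff(y).mean())                 # 2.0
```

Only the first and last observations affect the slope, so the forecast is sensitive to where the sample starts and ends.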

Drift + Seasonal Naïve combined: the seasonal naïve forecast plus the drift increment (h × drift at step h).

pred_drift_seasonal = pd.DataFrame(
    {'production': pred_seasonal_naive['production'] + np.arange(1, 21) * drift}, 
    index=future)
beer.production.plot()
pred_drift_seasonal.production.plot()
