# map each label to the list of indices where it appears
train_label2idx = {}
test_label2idx = {}
for i, label in enumerate(train_labels):
    if label not in train_label2idx:
        train_label2idx[label] = [i]
    else:
        train_label2idx[label].append(i)
for i, label in enumerate(test_labels):
    if label not in test_label2idx:
        test_label2idx[label] = [i]
    else:
        test_label2idx[label].append(i)

# build positive pairs (same label) and negative pairs (different labels)
train_positives = []
train_negatives = []
test_positives = []
test_negatives = []
for label, indices in train_label2idx.items():
    other_indices = set(range(n_train)) - set(indices)
    for i, idx1 in enumerate(indices):
        for idx2 in indices[i+1:]:
            train_positives.append((idx1, idx2))
        for idx2 in other_indices:
            train_negatives.append((idx1, idx2))
for label, indices in test_label2idx.items():
    other_indices = set(range(n_test)) - set(indices)
    for i, idx1 in enumerate(indices):
        for idx2 in indices[i+1:]:
            test_positives.append((idx1, idx2))
        for idx2 in other_indices:
            test_negatives.append((idx1, idx2))
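As a quick sanity check, the same pairing logic can be run on a toy label list (the labels here are made up for the demo):

```python
# Toy demo of the positive/negative pair construction above.
train_labels = [0, 0, 1, 1, 1]   # hypothetical labels
n_train = len(train_labels)

train_label2idx = {}
for i, label in enumerate(train_labels):
    train_label2idx.setdefault(label, []).append(i)

train_positives = []
train_negatives = []
for label, indices in train_label2idx.items():
    other_indices = set(range(n_train)) - set(indices)
    for i, idx1 in enumerate(indices):
        for idx2 in indices[i+1:]:
            train_positives.append((idx1, idx2))
        for idx2 in other_indices:
            train_negatives.append((idx1, idx2))

print(train_positives)       # [(0, 1), (2, 3), (2, 4), (3, 4)]
print(len(train_negatives))  # 12 ordered cross-label pairs
```

Note that each cross-label pair appears twice in the negatives (once per direction), which is why 2 × 3 labels on each side give 12 negative pairs.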
The notes below summarize the Udemy course Pytorch: Deep Learning and Artificial Intelligence.
Reinforcement Learning
applying machine learning to the stock market
supervised learning only makes the prediction; you must still take the action
an alternative framing uses RNNs (prediction) rather than RL
actions = buy / sell / hold
state = stock prices / # of shares owned / amount of cash I have
reward = some function of portfolio value gained or lost
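One simple choice of reward (a sketch of my own, not necessarily the course's exact formula) is the change in total portfolio value between steps; the numbers below are hypothetical:

```python
import numpy as np

def portfolio_value(shares, prices, cash):
    """Total value = market value of holdings + uninvested cash."""
    return float(np.dot(shares, prices)) + cash

# Hypothetical portfolio of 3 stocks.
shares = np.array([10, 0, 5])
prices_before = np.array([100.0, 50.0, 20.0])
prices_after = np.array([102.0, 50.0, 19.0])
cash = 500.0

# Reward = change in portfolio value after the price move.
reward = (portfolio_value(shares, prices_after, cash)
          - portfolio_value(shares, prices_before, cash))
print(reward)  # 15.0
```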
build the environment
the state will consist of 3 parts:
1. how many shares of each stock I own
2. the current price of each stock
3. how much cash I have (uninvested)
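For 3 stocks, the three parts above give a state vector of size 2 × 3 + 1 = 7 (the flat array layout and the numbers are my assumption for illustration):

```python
import numpy as np

n_stocks = 3
shares_owned = np.array([10, 0, 5], dtype=np.float32)     # hypothetical holdings
stock_prices = np.array([100.0, 50.0, 20.0], dtype=np.float32)
cash = 500.0

# State = [shares of each stock, price of each stock, uninvested cash]
state = np.concatenate([shares_owned, stock_prices, [cash]])
print(state.shape)  # (7,)
```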
action
buy / sell / hold (do nothing)
with 3 stocks, each stock can take one of the 3 actions, so there are 3^3 = 27 possible combinations
e.g. sell, sell, sell
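The 27 combinations can be enumerated with itertools.product (the encoding 0 = sell, 1 = hold, 2 = buy is an assumption for the sketch):

```python
import itertools

# One action per stock: 0 = sell, 1 = hold, 2 = buy (assumed encoding).
action_list = list(itertools.product([0, 1, 2], repeat=3))
print(len(action_list))  # 27
print(action_list[0])    # (0, 0, 0) -> sell, sell, sell
```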
pop(0) => removes index 0 (the oldest entry) when the buffer is full
states (N x D array)
actions (N array)
rewards (N array)
next states (N x D array)
done flags (N array)
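A minimal buffer holding exactly these transitions, using a plain list and pop(0) as in the note above (a sketch; a preallocated circular numpy buffer is a common faster alternative):

```python
import numpy as np

class ReplayBuffer:
    """Stores (state, action, reward, next_state, done) transitions.

    When full, the oldest transition is dropped with pop(0).
    """
    def __init__(self, max_size=10000):
        self.max_size = max_size
        self.storage = []

    def store(self, state, action, reward, next_state, done):
        if len(self.storage) >= self.max_size:
            self.storage.pop(0)  # drop the oldest transition
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        idx = np.random.randint(0, len(self.storage), size=batch_size)
        states, actions, rewards, next_states, dones = zip(
            *(self.storage[i] for i in idx))
        return (np.array(states), np.array(actions), np.array(rewards),
                np.array(next_states), np.array(dones))

# Usage with made-up transitions (D = 7, matching the state sketch above).
buf = ReplayBuffer(max_size=100)
for _ in range(5):
    buf.store(np.random.randn(7), 0, 1.0, np.random.randn(7), False)
states, actions, rewards, next_states, dones = buf.sample(batch_size=4)
print(states.shape)  # (4, 7)
```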
train and test
environment
agent
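The overall structure — environment, agent, and a train/test loop — can be sketched as below; every class and method name here is my own placeholder, not the course's exact API, and the environment returns random data just to make the loop runnable:

```python
import numpy as np

class DummyEnv:
    """Placeholder environment: random states/rewards, 10 steps per episode."""
    def __init__(self, state_dim=7):
        self.state_dim = state_dim
        self.t = 0

    def reset(self):
        self.t = 0
        return np.random.randn(self.state_dim)

    def step(self, action):
        self.t += 1
        next_state = np.random.randn(self.state_dim)
        reward = float(np.random.randn())
        done = self.t >= 10
        return next_state, reward, done

class DummyAgent:
    """Placeholder agent: picks one of the 27 joint actions at random."""
    def act(self, state):
        return np.random.randint(27)

    def train(self, state, action, reward, next_state, done):
        pass  # a real agent would update its Q-network here

def play_one_episode(agent, env, is_train=True):
    state = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        total_reward += reward
        if is_train:
            agent.train(state, action, reward, next_state, done)
        state = next_state
    return total_reward

env, agent = DummyEnv(), DummyAgent()
rewards = [play_one_episode(agent, env) for _ in range(3)]
print(len(rewards))  # 3 episode returns
```

For testing, the same loop runs with is_train=False so the agent acts without updating.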
Running this on Google Colab is much slower than running it locally, so it is better to run it elsewhere.
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
from datetime import datetime
import itertools
import argparse
import re
import os
import pickle
from sklearn.preprocessing import StandardScaler