(Python) - 3 Python Training

Development/Python 2024. 8. 26. 12:51

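■ numpy: review of the previous session

■ pandas: review of the previous session

df.index : shows the index information

※ The object dtype that pandas reports usually means the column holds strings; it does not mean "object" in the class-instance sense.

Most methods return a new DataFrame. To keep the change, assign the result back to df or pass inplace=True, which overwrites df.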
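The difference between reassigning and inplace=True can be sketched as follows. The toy DataFrame and its column names are made up for illustration and are not the data used in class.

```python
import pandas as pd

# Hypothetical toy data (made-up values), only to show the overwrite behavior.
df = pd.DataFrame({"country": ["Afghanistan", "Zimbabwe"], "pop": [100, 200]})

# Calling a method without saving the result leaves df unchanged.
df.rename(columns={"pop": "population"})
unchanged_columns = list(df.columns)

# Option 1: assign the result back to df.
df = df.rename(columns={"pop": "population"})

# Option 2: modify df directly with inplace=True (returns None).
df.rename(columns={"country": "nation"}, inplace=True)
final_columns = list(df.columns)

print(unchanged_columns)
print(final_columns)
```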

■ plotly

axis = 0 (rows) : the operation runs downward, along the rows

axis = 1 (columns) : the operation runs sideways, along the columns
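A small sketch of the two directions, using sum() on a made-up DataFrame (the column names a and b are placeholders):

```python
import pandas as pd

# Hypothetical 3x2 table to show the direction of each axis.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: move down the rows, so there is one result per column.
col_sums = df.sum(axis=0)

# axis=1: move across the columns, so there is one result per row.
row_sums = df.sum(axis=1)

print(col_sums)
print(row_sums)
```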

Frequently used methods

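■ Hands-on practice in Jupyter Notebook

pip list : show the installed modules

findstr + module name : search for a module (Windows, for example pip list | findstr pandas)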

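Print the contents of the country column

Comparing the column with Afghanistan returns True or False for each row

Check whether either of two values is present

Use isin(), or the code on line 18

df.query() gives the same result.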
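The filtering steps above might look like this. The DataFrame is a made-up stand-in for the class data, and the `|` version is only a guess at what the line 18 code did.

```python
import pandas as pd

# Hypothetical stand-in for the class dataset (made-up rows).
df = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan", "Zimbabwe", "Zimbabwe", "Albania"],
    "year": [1997, 2002, 1997, 2002, 2002],
})

# Select the country column.
countries = df["country"]

# Row-by-row True/False for a single value.
mask = df["country"] == "Afghanistan"

# Two values: isin(), an explicit OR of two comparisons, or query().
either_isin = df[df["country"].isin(["Afghanistan", "Zimbabwe"])]
either_or = df[(df["country"] == "Afghanistan") | (df["country"] == "Zimbabwe")]
either_query = df.query("country in ['Afghanistan', 'Zimbabwe']")

print(mask)
print(either_isin)
```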

Print the rows where the country is Afghanistan and the year is 2002

Rows for Afghanistan or Zimbabwe with a value greater than 2000
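A minimal sketch of combining conditions with &. The data is made up, and it assumes that "greater than 2000" refers to the year column, which the notes do not state.

```python
import pandas as pd

# Hypothetical stand-in for the class dataset (made-up rows).
df = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan", "Zimbabwe", "Zimbabwe", "Albania"],
    "year": [1997, 2002, 1997, 2002, 2002],
})

# Each condition needs its own parentheses when combined with & or |.
afg_2002 = df[(df["country"] == "Afghanistan") & (df["year"] == 2002)]

# Assumption: the threshold of 2000 applies to the year column.
after_2000 = df[df["country"].isin(["Afghanistan", "Zimbabwe"]) & (df["year"] > 2000)]

print(afg_2002)
print(after_2000)
```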

To find the Korea data, first store the Asia data in df.

Use unique() to list the country names that are stored.

Korea is stored under the following names.

Keep only the Korea rows in df,

and for convenience rename Korea, Dem. Rep. to Korea.
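These steps can be sketched as below. The table is a made-up miniature, the continent column name is an assumption, and keeping only the Korea, Dem. Rep. rows is also an assumption, since the notes do not show which rows were kept.

```python
import pandas as pd

# Hypothetical miniature dataset (made-up values, assumed column names).
data = pd.DataFrame({
    "country": ["Japan", "Korea, Dem. Rep.", "Korea, Rep.", "France"],
    "continent": ["Asia", "Asia", "Asia", "Europe"],
    "pop": [100, 200, 300, 400],
})

# 1) Keep only the Asia rows.
df = data[data["continent"] == "Asia"]

# 2) List the distinct country names to see how Korea is spelled.
names = df["country"].unique()

# 3) Keep only the Korea rows (assumption: the "Korea, Dem. Rep." entry).
df = df[df["country"] == "Korea, Dem. Rep."].copy()

# 4) Shorten the name for convenience.
df["country"] = df["country"].replace("Korea, Dem. Rep.", "Korea")

print(list(names))
print(df)
```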

Import matplotlib.pyplot,

and the data can be visualized with plt.plot('year','pop', data=df).

pandas builds plotting into the DataFrame itself, so the same data can also be plotted as follows.

df.plot(kind='barh', x='year',y=['pop'])

kind sets the chart type,

and the x axis and y axis can each be set separately.
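Both plotting styles side by side, as a rough sketch. The year and pop values are made up, and the Agg backend line is only there so the script runs without a display; it can be dropped in Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values standing in for the Korea rows.
df = pd.DataFrame({"year": [1997, 2002, 2007], "pop": [10, 12, 15]})

# matplotlib style: column names are looked up in the data argument.
lines = plt.plot("year", "pop", data=df)
line_y = list(lines[0].get_ydata())

# pandas style: the DataFrame draws itself; kind picks the chart type.
ax = df.plot(kind="barh", x="year", y=["pop"])
n_bars = len(ax.patches)

plt.close("all")
```

Changing kind to 'line' or 'bar' redraws the same columns as a different chart type.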

■ seaborn

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_theme(style="white", rc={"axes.facecolor": (0, 0, 0, 0)})

# Create the data
rs = np.random.RandomState(1979)
x = rs.randn(500)
g = np.tile(list("ABCDEFGHIJ"), 50)
df = pd.DataFrame(dict(x=x, g=g))
m = df.g.map(ord)
df["x"] += m

# Initialize the FacetGrid object
pal = sns.cubehelix_palette(10, rot=-.25, light=.7)
g = sns.FacetGrid(df, row="g", hue="g", aspect=15, height=.5, palette=pal)

# Draw the densities in a few steps
g.map(sns.kdeplot, "x",
      bw_adjust=.5, clip_on=False,
      fill=True, alpha=1, linewidth=1.5)
g.map(sns.kdeplot, "x", clip_on=False, color="w", lw=2, bw_adjust=.5)

# passing color=None to refline() uses the hue mapping
g.refline(y=0, linewidth=2, linestyle="-", color=None, clip_on=False)


# Define and use a simple function to label the plot in axes coordinates
def label(x, color, label):
    ax = plt.gca()
    ax.text(0, .2, label, fontweight="bold", color=color,
            ha="left", va="center", transform=ax.transAxes)


g.map(label, "x")

# Set the subplots to overlap
g.figure.subplots_adjust(hspace=-.25)

# Remove axes details that don't play well with overlap
g.set_titles("")
g.set(yticks=[], ylabel="")
g.despine(bottom=True, left=True)