Operating on Data in Pandas
Ufuncs: Index Preservation
In [1]:
import pandas as pd
import numpy as np
In [2]:
rng = np.random.RandomState(42)
ser = pd.Series(rng.randint(0, 10, 4))
ser
Out[2]:
In [3]:
df = pd.DataFrame(rng.randint(0, 10, (3, 4)),
                  columns=['A', 'B', 'C', 'D'])
df
Out[3]:
In [4]:
np.exp(ser)
Out[4]:
In [5]:
np.sin(df * np.pi / 4)
Out[5]:
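The cells above apply NumPy ufuncs directly to a Series and a DataFrame. The following minimal sketch, using small hypothetical objects (`ser_demo`, `df_demo`) that are not part of the notebook, checks that the result keeps the same labels as the input.

```python
import numpy as np
import pandas as pd

# Hypothetical labels and values, chosen only for illustration.
ser_demo = pd.Series([0.0, 1.0, 2.0], index=['a', 'b', 'c'])
df_demo = pd.DataFrame({'A': [0.0, 1.0], 'B': [2.0, 3.0]}, index=['x', 'y'])

exp_ser = np.exp(ser_demo)             # still a Series, same index labels
sin_df = np.sin(df_demo * np.pi / 4)   # still a DataFrame, same rows and columns

# Compare the labels of each result with the labels of its input.
print(exp_ser.index.equals(ser_demo.index))
print(sin_df.index.equals(df_demo.index), sin_df.columns.equals(df_demo.columns))
```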
Ufuncs: Index Alignment
Index alignment in Series
In [6]:
area = pd.Series({'Alaska': 1723337, 'Texas': 695662,
                  'California': 423967}, name='area')
population = pd.Series({'California': 38332521, 'Texas': 26448193,
                        'New York': 19651127}, name='population')
Let's see what happens when we divide these to compute the population density:
In [7]:
population / area
Out[7]:
In [8]:
area.index.union(population.index)
Out[8]:
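The division above aligns the two Series on their labels before computing. As a rough sketch that rebuilds the same two Series so it runs on its own, the snippet below checks which labels end up as NaN and that the result's labels are the union of both indices. The name `density` is introduced here only for illustration.

```python
import pandas as pd

area = pd.Series({'Alaska': 1723337, 'Texas': 695662,
                  'California': 423967}, name='area')
population = pd.Series({'California': 38332521, 'Texas': 26448193,
                        'New York': 19651127}, name='population')

density = population / area

# Labels found in only one of the two Series have no partner value,
# so the result holds NaN there.
print(density.isna())

# The labels of the result are the union of the two input indices.
print(set(density.index) == set(area.index.union(population.index)))
```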
In [9]:
A = pd.Series([2, 4, 6], index=[0, 1, 2])
B = pd.Series([1, 3, 5], index=[1, 2, 3])
A + B
Out[9]:
In [10]:
A.add(B, fill_value=0)
Out[10]:
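To make the effect of `fill_value` concrete, here is a small sketch comparing the plain operator with the method call on the same two Series. The names `plain` and `filled` are introduced only for this comparison.

```python
import pandas as pd

A = pd.Series([2, 4, 6], index=[0, 1, 2])
B = pd.Series([1, 3, 5], index=[1, 2, 3])

plain = A + B                     # labels 0 and 3 exist in only one Series -> NaN
filled = A.add(B, fill_value=0)   # the missing side is treated as 0 before adding

print(plain.isna().tolist())
print(filled.tolist())
```

With `fill_value=0`, label 0 keeps the value from `A` and label 3 keeps the value from `B`, while labels 1 and 2 are ordinary sums.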
Index alignment in DataFrame
In [11]:
A = pd.DataFrame(rng.randint(0, 20, (2, 2)),
                 columns=list('AB'))
A
Out[11]:
In [12]:
B = pd.DataFrame(rng.randint(0, 10, (3, 3)),
                 columns=list('BAC'))
B
Out[12]:
In [13]:
A + B
Out[13]:
In [14]:
fill = A.stack().mean()
A.add(B, fill_value=fill)
Out[14]:
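Because the frames above are random, the alignment can be easier to follow with fixed numbers. The sketch below uses hypothetical hand-picked values (not the notebook's data) and computes the fill value with `to_numpy().mean()`, which gives the same mean of all entries as `A.stack().mean()`.

```python
import pandas as pd

# Hypothetical fixed values so each result can be checked by hand.
A = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
B = pd.DataFrame([[10, 20, 30], [40, 50, 60], [70, 80, 90]],
                 columns=list('BAC'))

total = A + B
# Values are matched by row label AND column label, not by position:
# total.loc[0, 'A'] is A's 1 plus B's 20 (B's column 'A' is its second column).
print(total)

fill = A.to_numpy().mean()            # mean of all values in A: 2.5
filled = A.add(B, fill_value=fill)    # entries missing from A are treated as 2.5
print(filled)
```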
Python Operator | Pandas Method(s) |
---|---|
+ | add() |
- | sub(), subtract() |
* | mul(), multiply() |
/ | truediv(), div(), divide() |
// | floordiv() |
% | mod() |
** | pow() |
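As a quick sanity check of the table, the following sketch compares each Python operator with the first method listed for it, on two small hypothetical Series (`s1`, `s2`). It is only a spot check on integer data, not a statement about every dtype.

```python
import operator
import pandas as pd

# Hypothetical sample values; both Series share the default integer index.
s1 = pd.Series([7, 8, 9])
s2 = pd.Series([2, 3, 4])

pairs = [
    (operator.add, 'add'),
    (operator.sub, 'sub'),
    (operator.mul, 'mul'),
    (operator.truediv, 'truediv'),
    (operator.floordiv, 'floordiv'),
    (operator.mod, 'mod'),
    (operator.pow, 'pow'),
]

# For each pair, check that the operator and the method give equal results.
matches = {name: op(s1, s2).equals(getattr(s1, name)(s2)) for op, name in pairs}
print(matches)
```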
Ufuncs: Operations Between DataFrame and Series
In [15]:
A = rng.randint(10, size=(3, 4))
A
Out[15]:
In [16]:
A - A[0]
Out[16]:
In [17]:
df = pd.DataFrame(A, columns=list('QRST'))
df - df.iloc[0]
Out[17]:
In [18]:
df.subtract(df['R'], axis=0)
Out[18]:
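The difference between the default row-wise behaviour and `axis=0` may be clearer with fixed numbers. This is a minimal sketch on a hypothetical 2x3 frame (`small`), not the notebook's random data.

```python
import pandas as pd

# Hypothetical fixed values so the results can be verified by hand.
small = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list('QRS'))

# Default: the Series' labels are matched against the column labels,
# so the first row is subtracted from every row.
row_wise = small - small.iloc[0]

# axis=0: the Series' labels are matched against the row labels,
# so column 'R' is subtracted from every column.
col_wise = small.subtract(small['R'], axis=0)

print(row_wise)
print(col_wise)
```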
In [19]:
halfrow = df.iloc[0, ::2]
halfrow
Out[19]:
In [20]:
df - halfrow
Out[20]:
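The last cell subtracts a Series that covers only some of the columns. A small sketch with hypothetical fixed values (`small`, `half`, `diff` are names introduced here) shows that the columns missing from the Series come out as NaN, while the matching columns are subtracted as usual.

```python
import pandas as pd

# Hypothetical fixed values so the results can be verified by hand.
small = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=list('QRST'))

half = small.iloc[0, ::2]   # first row, every second column: Q=1, S=3
diff = small - half

# Columns 'R' and 'T' have no matching label in `half`, so they become NaN.
print(diff)
```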