Browse Source

Pandas

pull/152/head
Jure Šorn 1 year ago
parent
commit
4450c119ad
2 changed files with 26 additions and 26 deletions
  1. 26
      README.md
  2. 26
      index.html

26
README.md

@ -3120,7 +3120,6 @@ Pandas
```python
# $ pip3 install pandas matplotlib
import pandas as pd
from pandas import Series, DataFrame
import matplotlib.pyplot as plt
```
@ -3128,16 +3127,16 @@ import matplotlib.pyplot as plt
**Ordered dictionary with a name.**
```python
>>> Series([1, 2], index=['x', 'y'], name='a')
>>> pd.Series([1, 2], index=['x', 'y'], name='a')
x 1
y 2
Name: a, dtype: int64
```
```python
<Sr> = Series(<list>) # Assigns RangeIndex starting at 0.
<Sr> = Series(<dict>) # Takes dictionary's keys for index.
<Sr> = Series(<dict/Series>, index=<list>) # Only keeps items with keys specified in index.
<Sr> = pd.Series(<list>) # Assigns RangeIndex starting at 0.
<Sr> = pd.Series(<dict>) # Takes dictionary's keys for index.
<Sr> = pd.Series(<dict/Series>, index=<list>) # Only keeps items with keys specified in index.
```
```python
@ -3176,7 +3175,7 @@ plt.show() # Displays the plot. Also plt.sav
```
```python
>>> sr = Series([1, 2], index=['x', 'y'])
>>> sr = pd.Series([1, 2], index=['x', 'y'])
x 1
y 2
```
@ -3201,20 +3200,21 @@ y 2
```
* **Methods ffill(), interpolate() and fillna() accept argument 'inplace' that defaults to False.**
* **Last result has a hierarchical index. Use `'<Sr>[key_1, key_2]'` to get its values.**
* **Keys, indexes and bools can't be tuples because `'obj[x, y]'` becomes `'obj[(x, y)]'`.**
### DataFrame
**Table with labeled rows and columns.**
```python
>>> DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
>>> pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
x y
a 1 2
b 3 4
```
```python
<DF> = DataFrame(<list_of_rows>) # Rows can be either lists, dicts or series.
<DF> = DataFrame(<dict_of_columns>) # Columns can be either lists, dicts or series.
<DF> = pd.DataFrame(<list_of_rows>) # Rows can be either lists, dicts or series.
<DF> = pd.DataFrame(<dict_of_columns>) # Columns can be either lists, dicts or series.
```
```python
@ -3244,11 +3244,11 @@ b 3 4
#### DataFrame — Merge, Join, Concat:
```python
>>> l = DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
>>> l = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
x y
a 1 2
b 3 4
>>> r = DataFrame([[4, 5], [6, 7]], index=['b', 'c'], columns=['y', 'z'])
>>> r = pd.DataFrame([[4, 5], [6, 7]], index=['b', 'c'], columns=['y', 'z'])
y z
b 4 5
c 6 7
@ -3295,7 +3295,7 @@ c 6 7
* **All operations operate on columns by default. Pass `'axis=1'` to process the rows instead.**
```python
>>> df = DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
>>> df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
x y
a 1 2
b 3 4
@ -3347,7 +3347,7 @@ plt.show() # Displays the plot. Also plt.sav
**Object that groups together rows of a dataframe based on the value of the passed column.**
```python
>>> df = DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 6]], index=list('abc'), columns=list('xyz'))
>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 6]], list('abc'), list('xyz'))
>>> df.groupby('z').get_group(6)
x y
b 4 5

26
index.html

@ -2555,20 +2555,19 @@ W, H, MAX_S = <span class="hljs-number">50</span>, <span class="hljs-number">50<
<div><h2 id="pandas"><a href="#pandas" name="pandas">#</a>Pandas</h2><pre><code class="python language-python hljs"><span class="hljs-comment"># $ pip3 install pandas matplotlib</span>
<span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-keyword">from</span> pandas <span class="hljs-keyword">import</span> Series, DataFrame
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
</code></pre></div>
<div><h3 id="series">Series</h3><p><strong>Ordered dictionary with a name.</strong></p><pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>], name=<span class="hljs-string">'a'</span>)
<div><h3 id="series">Series</h3><p><strong>Ordered dictionary with a name.</strong></p><pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>], name=<span class="hljs-string">'a'</span>)
x <span class="hljs-number">1</span>
y <span class="hljs-number">2</span>
Name: a, dtype: int64
</code></pre></div>
<pre><code class="python language-python hljs">&lt;Sr&gt; = Series(&lt;list&gt;) <span class="hljs-comment"># Assigns RangeIndex starting at 0.</span>
&lt;Sr&gt; = Series(&lt;dict&gt;) <span class="hljs-comment"># Takes dictionary's keys for index.</span>
&lt;Sr&gt; = Series(&lt;dict/Series&gt;, index=&lt;list&gt;) <span class="hljs-comment"># Only keeps items with keys specified in index.</span>
<pre><code class="python language-python hljs">&lt;Sr&gt; = pd.Series(&lt;list&gt;) <span class="hljs-comment"># Assigns RangeIndex starting at 0.</span>
&lt;Sr&gt; = pd.Series(&lt;dict&gt;) <span class="hljs-comment"># Takes dictionary's keys for index.</span>
&lt;Sr&gt; = pd.Series(&lt;dict/Series&gt;, index=&lt;list&gt;) <span class="hljs-comment"># Only keeps items with keys specified in index.</span>
</code></pre>
<pre><code class="python language-python hljs">&lt;el&gt; = &lt;Sr&gt;.loc[key] <span class="hljs-comment"># Or: &lt;Sr&gt;.iloc[index]</span>
&lt;Sr&gt; = &lt;Sr&gt;.loc[keys] <span class="hljs-comment"># Or: &lt;Sr&gt;.iloc[indexes]</span>
@ -2593,7 +2592,7 @@ plt.show() <span class="hljs-comment"># Disp
&lt;Sr&gt; = &lt;Sr&gt;.fillna(&lt;el&gt;) <span class="hljs-comment"># Or: &lt;Sr&gt;.agg/transform/map(lambda &lt;el&gt;: &lt;el&gt;)</span>
</code></pre></div>
<pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>sr = Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>])
<pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>sr = pd.Series([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], index=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>])
x <span class="hljs-number">1</span>
y <span class="hljs-number">2</span>
</code></pre>
@ -2616,16 +2615,17 @@ y <span class="hljs-number">2</span>
<ul>
<li><strong>Methods ffill(), interpolate() and fillna() accept argument 'inplace' that defaults to False.</strong></li>
<li><strong>Last result has a hierarchical index. Use <code class="python hljs"><span class="hljs-string">'&lt;Sr&gt;[key_1, key_2]'</span></code> to get its values.</strong></li>
<li><strong>Keys, indexes and bools can't be tuples because <code class="python hljs"><span class="hljs-string">'obj[x, y]'</span></code> becomes <code class="python hljs"><span class="hljs-string">'obj[(x, y)]'</span></code>.</strong></li>
</ul>
<div><h3 id="dataframe">DataFrame</h3><p><strong>Table with labeled rows and columns.</strong></p><pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">3</span>, <span class="hljs-number">4</span>]], index=[<span class="hljs-string">'a'</span>, <span class="hljs-string">'b'</span>], columns=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>])
<div><h3 id="dataframe">DataFrame</h3><p><strong>Table with labeled rows and columns.</strong></p><pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>pd.DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">3</span>, <span class="hljs-number">4</span>]], index=[<span class="hljs-string">'a'</span>, <span class="hljs-string">'b'</span>], columns=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>])
x y
a <span class="hljs-number">1</span> <span class="hljs-number">2</span>
b <span class="hljs-number">3</span> <span class="hljs-number">4</span>
</code></pre></div>
<pre><code class="python language-python hljs">&lt;DF&gt; = DataFrame(&lt;list_of_rows&gt;) <span class="hljs-comment"># Rows can be either lists, dicts or series.</span>
&lt;DF&gt; = DataFrame(&lt;dict_of_columns&gt;) <span class="hljs-comment"># Columns can be either lists, dicts or series.</span>
<pre><code class="python language-python hljs">&lt;DF&gt; = pd.DataFrame(&lt;list_of_rows&gt;) <span class="hljs-comment"># Rows can be either lists, dicts or series.</span>
&lt;DF&gt; = pd.DataFrame(&lt;dict_of_columns&gt;) <span class="hljs-comment"># Columns can be either lists, dicts or series.</span>
</code></pre>
<pre><code class="python language-python hljs">&lt;el&gt; = &lt;DF&gt;.loc[row_key, column_key] <span class="hljs-comment"># Or: &lt;DF&gt;.iloc[row_index, column_index]</span>
&lt;Sr/DF&gt; = &lt;DF&gt;.loc[row_key/s] <span class="hljs-comment"># Or: &lt;DF&gt;.iloc[row_index/es]</span>
@ -2644,11 +2644,11 @@ b <span class="hljs-number">3</span> <span class="hljs-number">4</span>
&lt;DF&gt; = &lt;DF&gt;.sort_index(ascending=<span class="hljs-keyword">True</span>) <span class="hljs-comment"># Sorts rows by row keys. Use `axis=1` for cols.</span>
&lt;DF&gt; = &lt;DF&gt;.sort_values(column_key/s) <span class="hljs-comment"># Sorts rows by the passed column/s. Same.</span>
</code></pre>
<div><h4 id="dataframemergejoinconcat">DataFrame — Merge, Join, Concat:</h4><pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>l = DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">3</span>, <span class="hljs-number">4</span>]], index=[<span class="hljs-string">'a'</span>, <span class="hljs-string">'b'</span>], columns=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>])
<div><h4 id="dataframemergejoinconcat">DataFrame — Merge, Join, Concat:</h4><pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>l = pd.DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">3</span>, <span class="hljs-number">4</span>]], index=[<span class="hljs-string">'a'</span>, <span class="hljs-string">'b'</span>], columns=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>])
x y
a <span class="hljs-number">1</span> <span class="hljs-number">2</span>
b <span class="hljs-number">3</span> <span class="hljs-number">4</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>r = DataFrame([[<span class="hljs-number">4</span>, <span class="hljs-number">5</span>], [<span class="hljs-number">6</span>, <span class="hljs-number">7</span>]], index=[<span class="hljs-string">'b'</span>, <span class="hljs-string">'c'</span>], columns=[<span class="hljs-string">'y'</span>, <span class="hljs-string">'z'</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>r = pd.DataFrame([[<span class="hljs-number">4</span>, <span class="hljs-number">5</span>], [<span class="hljs-number">6</span>, <span class="hljs-number">7</span>]], index=[<span class="hljs-string">'b'</span>, <span class="hljs-string">'c'</span>], columns=[<span class="hljs-string">'y'</span>, <span class="hljs-string">'z'</span>])
y z
b <span class="hljs-number">4</span> <span class="hljs-number">5</span>
c <span class="hljs-number">6</span> <span class="hljs-number">7</span>
@ -2692,7 +2692,7 @@ c <span class="hljs-number">6</span> <span class="hljs-number">7</span>
<ul>
<li><strong>All operations operate on columns by default. Pass <code class="python hljs"><span class="hljs-string">'axis=1'</span></code> to process the rows instead.</strong></li>
</ul>
<pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>df = DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">3</span>, <span class="hljs-number">4</span>]], index=[<span class="hljs-string">'a'</span>, <span class="hljs-string">'b'</span>], columns=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>])
<pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>df = pd.DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">3</span>, <span class="hljs-number">4</span>]], index=[<span class="hljs-string">'a'</span>, <span class="hljs-string">'b'</span>], columns=[<span class="hljs-string">'x'</span>, <span class="hljs-string">'y'</span>])
x y
a <span class="hljs-number">1</span> <span class="hljs-number">2</span>
b <span class="hljs-number">3</span> <span class="hljs-number">4</span>
@ -2732,7 +2732,7 @@ plt.show() <span class="hljs-comment"># Disp
&lt;DF&gt;.to_pickle/excel(&lt;path&gt;) <span class="hljs-comment"># Run `$ pip3 install openpyxl` for xlsx files.</span>
&lt;DF&gt;.to_sql(<span class="hljs-string">'&lt;table_name&gt;'</span>, &lt;connection&gt;) <span class="hljs-comment"># Accepts SQLite3 or SQLAlchemy connection.</span>
</code></pre>
<div><h3 id="groupby">GroupBy</h3><p><strong>Object that groups together rows of a dataframe based on the value of the passed column.</strong></p><pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>df = DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>], [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">6</span>]], index=list(<span class="hljs-string">'abc'</span>), columns=list(<span class="hljs-string">'xyz'</span>))
<div><h3 id="groupby">GroupBy</h3><p><strong>Object that groups together rows of a dataframe based on the value of the passed column.</strong></p><pre><code class="python language-python hljs"><span class="hljs-meta">&gt;&gt;&gt; </span>df = pd.DataFrame([[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>], [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">6</span>]], list(<span class="hljs-string">'abc'</span>), list(<span class="hljs-string">'xyz'</span>))
<span class="hljs-meta">&gt;&gt;&gt; </span>df.groupby(<span class="hljs-string">'z'</span>).get_group(<span class="hljs-number">6</span>)
x y
b <span class="hljs-number">4</span> <span class="hljs-number">5</span>

Loading…
Cancel
Save