
CSV, Pandas a lot of changes

pull/192/head
Jure Šorn 2 months ago
parent
commit
1e590add20
2 changed files with 14 additions and 14 deletions
  1. README.md (12 changes)
  2. index.html (16 changes)

12
README.md

@@ -1844,8 +1844,8 @@ import csv
### Parameters
* **`'dialect'` - Master parameter that sets the default values. String or a 'csv.Dialect' object.**
* **`'delimiter'` - A one-character string used to separate fields.**
* **`'lineterminator'` - How writer terminates rows. Reader is hardcoded to '\n', '\r', '\r\n'.**
* **`'quotechar'` - Character for quoting fields that contain special characters.**
* **`'lineterminator'` - How writer terminates rows. Reader looks for '\n', '\r' and '\r\n'.**
* **`'quotechar'` - Character for quoting fields containing delimiters, quotechars, '\n' or '\r'.**
* **`'escapechar'` - Character for escaping quotechars.**
* **`'doublequote'` - Whether quotechars inside fields are/get doubled or escaped.**
* **`'quoting'` - 0: As necessary, 1: All, 2: All but numbers which are read as floats, 3: None.**
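
A minimal sketch of how these parameters interact, assuming the default `'quoting'` of 0 (as necessary). The sample fields and the in-memory `io.StringIO` buffers are made up for illustration and do not come from the original text:

```python
import csv, io

# Write one row to an in-memory buffer (sample values are illustrative).
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=',', quotechar='"', lineterminator='\n')
writer.writerow(['plain', 'has,comma', 'has "quote"', 'has\nnewline'])
text = buffer.getvalue()
# Only fields containing a delimiter, quotechar or newline get quoted, and
# embedded quotechars are doubled because 'doublequote' defaults to True.

# Reader reverses the quoting. Passing newline='' prevents newline translation.
rows = list(csv.reader(io.StringIO(text, newline='')))

# Reader splits rows on '\r\n' regardless of the writer's 'lineterminator'.
crlf_rows = list(csv.reader(io.StringIO('a,b\r\nc,d\r\n', newline='')))
```

Reading `text` back returns the four original fields as a single row, including the embedded newline.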
@@ -3186,14 +3186,14 @@ Name: a, dtype: int64
```python
<S> = <S>.head/describe/sort_values() # Also <S>.unique/value_counts/round/dropna().
<S> = <S>.str.strip/lower/contains/replace() # Also split().str[<int>] and split().explode().
<S> = <S>.str.strip/lower/contains/replace() # Also split().str[i] or split(expand=True).
<S> = <S>.dt.year/month/day/hour # Use pd.to_datetime(<S>) to get S of dates.
```
```python
<S>.plot.line/area/bar/pie/hist() # Generates a plot. `plt.show()` displays it.
```
* **Also: `'pd.cut(<S>, bins=<int/coll>)'` and `'<S>.quantile(<float/coll>)'`.**
* **Also `'<S>.quantile(<float/coll>)'` and `'pd.cut(<S>, bins=<int/coll>)'`.**
* **Indexing objects can't be tuples because `'obj[x, y]'` is converted to `'obj[(x, y)]'`.**
* **Pandas uses NumPy types like `'np.int64'`. Series is converted to `'float64'` if we assign np.nan to any item. Use `'<S>.astype(<str/type>)'` to get converted Series.**
* **Series will silently overflow if we run `'pd.Series([100], dtype="int8") + 100'`!**
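
The string, quantile, binning and overflow notes above can be tried out with a short sketch. All sample values and variable names are illustrative assumptions:

```python
import pandas as pd

names = pd.Series(['Ada Lovelace', 'Alan Turing'])   # Illustrative data.
first = names.str.split().str[0]          # S of first words.
parts = names.str.split(expand=True)      # DF with one column per word.

nums = pd.Series([1, 2, 3, 4, 5])
median = nums.quantile(0.5)               # A single float.
quartiles = nums.quantile([0.25, 0.75])   # S indexed by the passed floats.
halves = pd.cut(nums, bins=2)             # S of intervals, two equal-width bins.

small = pd.Series([100], dtype='int8') + 100   # Wraps around without an error.
```

Here `small` keeps the `'int8'` dtype, so the sum of 200 does not fit and the stored value wraps around instead of raising.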
@@ -3255,7 +3255,7 @@ b 3 4
```
```python
<DF> = <DF> > <el/S/DF> # Returns DF of bools. S is treated as a row.
<DF> = <DF> > <el/S/DF> # Returns DF of bools. Treats series as a row.
<DF> = <DF> + <el/S/DF> # Items with non-matching keys get value NaN.
```
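
A small sketch of both operations, assuming a 2x2 frame with row keys `'a'`, `'b'` and column keys `'x'`, `'y'` (all values are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
row = pd.Series([2, 2], index=['x', 'y'])

mask = df > row                             # Series keys are matched with column keys.
total = df + pd.Series([10], index=['x'])   # Column 'y' has no matching key.
```

Every row of `df` is compared against `row`, so only row `'b'` is True in `mask`. In `total` the whole `'y'` column becomes NaN.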
@@ -3338,7 +3338,7 @@ c 6 7
| | b 2.0 2.0 | b 2.0 2.0 | b 2.0 |
+-----------------+---------------+---------------+---------------+
```
* **All methods operate on columns by default. Pass `'axis=1'` to process the rows instead.**
* **Listed methods process the columns unless they receive `'axis=1'`. Exceptions to this rule are `'<DF>.dropna()'`, `'<DF>.drop(row_key/s)'` and `'<DF>.rename(<dict/func>)'`.**
* **Fifth result's columns are indexed with a multi-index. This means we need a tuple of column keys to specify a column: `'<DF>.loc[row_key, (col_key_1, col_key_2)]'`.**
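
The axis rule can be sketched as follows. `sum()` stands in for one of the listed methods, which is an assumption because the method list is not shown here, and the frame is illustrative:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])

col_sums = df.sum()                # One result per column.
row_sums = df.sum(axis=1)          # One result per row.

# The exceptions act on rows unless they receive axis=1.
no_row = df.drop('a')              # Drops row 'a'.
no_col = df.drop('x', axis=1)      # Drops column 'x'.
renamed = df.rename({'a': 'A'})    # Renames a row key, not a column key.
```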
### Multi-Index

16
index.html

@@ -55,7 +55,7 @@
<body>
<header>
<aside>February 4, 2025</aside>
<aside>February 5, 2025</aside>
<a href="https://gto76.github.io" rel="author">Jure Šorn</a>
</header>
@@ -1529,8 +1529,8 @@ CompletedProcess(args=[<span class="hljs-string">'bc'</span>, <span class="hljs-
<div><h3 id="parameters">Parameters</h3><ul>
<li><strong><code class="python hljs"><span class="hljs-string">'dialect'</span></code> - Master parameter that sets the default values. String or a 'csv.Dialect' object.</strong></li>
<li><strong><code class="python hljs"><span class="hljs-string">'delimiter'</span></code> - A one-character string used to separate fields.</strong></li>
<li><strong><code class="python hljs"><span class="hljs-string">'lineterminator'</span></code> - How writer terminates rows. Reader is hardcoded to '\n', '\r', '\r\n'.</strong></li>
<li><strong><code class="python hljs"><span class="hljs-string">'quotechar'</span></code> - Character for quoting fields that contain special characters.</strong></li>
<li><strong><code class="python hljs"><span class="hljs-string">'lineterminator'</span></code> - How writer terminates rows. Reader looks for '\n', '\r' and '\r\n'.</strong></li>
<li><strong><code class="python hljs"><span class="hljs-string">'quotechar'</span></code> - Character for quoting fields containing delimiters, quotechars, '\n' or '\r'.</strong></li>
<li><strong><code class="python hljs"><span class="hljs-string">'escapechar'</span></code> - Character for escaping quotechars.</strong></li>
<li><strong><code class="python hljs"><span class="hljs-string">'doublequote'</span></code> - Whether quotechars inside fields are/get doubled or escaped.</strong></li>
<li><strong><code class="python hljs"><span class="hljs-string">'quoting'</span></code> - 0: As necessary, 1: All, 2: All but numbers which are read as floats, 3: None.</strong></li>
@@ -2595,13 +2595,13 @@ Name: a, dtype: int64
&lt;S&gt; = &lt;S&gt; + &lt;el/S&gt; <span class="hljs-comment"># Items with non-matching keys get value NaN.</span>
</code></pre>
<pre><code class="python language-python hljs">&lt;S&gt; = &lt;S&gt;.head/describe/sort_values() <span class="hljs-comment"># Also &lt;S&gt;.unique/value_counts/round/dropna().</span>
&lt;S&gt; = &lt;S&gt;.str.strip/lower/contains/replace() <span class="hljs-comment"># Also split().str[&lt;int&gt;] and split().explode().</span>
&lt;S&gt; = &lt;S&gt;.str.strip/lower/contains/replace() <span class="hljs-comment"># Also split().str[i] or split(expand=True).</span>
&lt;S&gt; = &lt;S&gt;.dt.year/month/day/hour <span class="hljs-comment"># Use pd.to_datetime(&lt;S&gt;) to get S of dates.</span>
</code></pre>
<pre><code class="python language-python hljs">&lt;S&gt;.plot.line/area/bar/pie/hist() <span class="hljs-comment"># Generates a plot. `plt.show()` displays it.</span>
</code></pre>
<ul>
<li><strong>Also: <code class="python hljs"><span class="hljs-string">'pd.cut(&lt;S&gt;, bins=&lt;int/coll&gt;)'</span></code> and <code class="python hljs"><span class="hljs-string">'&lt;S&gt;.quantile(&lt;float/coll&gt;)'</span></code>.</strong></li>
<li><strong>Also <code class="python hljs"><span class="hljs-string">'&lt;S&gt;.quantile(&lt;float/coll&gt;)'</span></code> and <code class="python hljs"><span class="hljs-string">'pd.cut(&lt;S&gt;, bins=&lt;int/coll&gt;)'</span></code>.</strong></li>
<li><strong>Indexing objects can't be tuples because <code class="python hljs"><span class="hljs-string">'obj[x, y]'</span></code> is converted to <code class="python hljs"><span class="hljs-string">'obj[(x, y)]'</span></code>.</strong></li>
<li><strong>Pandas uses NumPy types like <code class="python hljs"><span class="hljs-string">'np.int64'</span></code>. Series is converted to <code class="python hljs"><span class="hljs-string">'float64'</span></code> if we assign np.nan to any item. Use <code class="python hljs"><span class="hljs-string">'&lt;S&gt;.astype(&lt;str/type&gt;)'</span></code> to get converted Series.</strong></li>
<li><strong>Series will silently overflow if we run <code class="python hljs"><span class="hljs-string">'pd.Series([100], dtype="int8") + 100'</span></code>!</strong></li>
@@ -2650,7 +2650,7 @@ b <span class="hljs-number">3</span> <span class="hljs-number">4</span>
&lt;DF&gt; = &lt;DF&gt;[&lt;S_of_bools&gt;] <span class="hljs-comment"># Filters rows. For example `df[df.x &gt; 1]`.</span>
&lt;DF&gt; = &lt;DF&gt;[&lt;DF_of_bools&gt;] <span class="hljs-comment"># Assigns NaN to items that are False in bools.</span>
</code></pre>
<pre><code class="python language-python hljs">&lt;DF&gt; = &lt;DF&gt; &gt; &lt;el/S/DF&gt; <span class="hljs-comment"># Returns DF of bools. S is treated as a row.</span>
<pre><code class="python language-python hljs">&lt;DF&gt; = &lt;DF&gt; &gt; &lt;el/S/DF&gt; <span class="hljs-comment"># Returns DF of bools. Treats series as a row.</span>
&lt;DF&gt; = &lt;DF&gt; + &lt;el/S/DF&gt; <span class="hljs-comment"># Items with non-matching keys get value NaN.</span>
</code></pre>
<pre><code class="python language-python hljs">&lt;DF&gt; = &lt;DF&gt;.set_index(col_key) <span class="hljs-comment"># Replaces row keys with column's values.</span>
@@ -2719,7 +2719,7 @@ c <span class="hljs-number">6</span> <span class="hljs-number">7</span>
</code></pre>
<ul>
<li><strong>All methods operate on columns by default. Pass <code class="python hljs"><span class="hljs-string">'axis=1'</span></code> to process the rows instead.</strong></li>
<li><strong>Listed methods process the columns unless they receive <code class="python hljs"><span class="hljs-string">'axis=1'</span></code>. Exceptions to this rule are <code class="python hljs"><span class="hljs-string">'&lt;DF&gt;.dropna()'</span></code>, <code class="python hljs"><span class="hljs-string">'&lt;DF&gt;.drop(row_key/s)'</span></code> and <code class="python hljs"><span class="hljs-string">'&lt;DF&gt;.rename(&lt;dict/func&gt;)'</span></code>.</strong></li>
<li><strong>Fifth result's columns are indexed with a multi-index. This means we need a tuple of column keys to specify a column: <code class="python hljs"><span class="hljs-string">'&lt;DF&gt;.loc[row_key, (col_key_1, col_key_2)]'</span></code>.</strong></li>
</ul>
<div><h3 id="multiindex">Multi-Index</h3><pre><code class="python language-python hljs">&lt;DF&gt; = &lt;DF&gt;.loc[row_key_1] <span class="hljs-comment"># Or: &lt;DF&gt;.xs(row_key_1)</span>
@@ -2931,7 +2931,7 @@ $ deactivate <span class="hljs-comment"># Deactivates the active
<footer>
<aside>February 4, 2025</aside>
<aside>February 5, 2025</aside>
<a href="https://gto76.github.io" rel="author">Jure Šorn</a>
</footer>
