diff --git a/README.md b/README.md
index 114b455..8d3d95d 100644
--- a/README.md
+++ b/README.md
@@ -3345,27 +3345,27 @@ c  6  7
 ```python
 <DF>   = <DF>.xs(key, level=<int>)             # Rows with key on passed level of multi-index.
 <DF>   = <DF>.xs(keys, level=<ints>, axis=1)   # Cols that have first key on first level, etc.
-<DF>   = <DF>.set_index(col_keys)              # Combines multiple columns into a multi-index.
+<DF>   = <DF>.set_index(col_keys)              # Creates index from cols. Also `append=False`.
 <S/DF> = <DF>.stack/unstack(level=-1)          # Combines col keys with row keys or vice versa.
 <DF>   = <DF>.pivot_table(index=col_key/s)     # `columns=key/s, values=key/s, aggfunc='mean'`.
 ```
 
-#### DataFrame — Encode, Decode:
+### File Formats
 ```python
-<DF>   = pd.read_json/pickle(<path/url/file>)  # Also accepts io.StringIO/BytesIO(<str/bytes>).
-<DF>   = pd.read_csv(<path/url/file>)          # `header/index_col/dtype/usecols/…=<obj>`.
-<DF>   = pd.read_excel(<path/url/file>)        # `sheet_name=None` returns dict of all sheets.
-<DF>   = pd.read_sql('<table/query>', <conn>)  # SQLite3/SQLAlchemy connection (see #SQLite).
-<list> = pd.read_html(<path/url/file>)         # Run `$ pip3 install beautifulsoup4 lxml`.
+<S/DF> = pd.read_json/pickle(<path/url/file>)  # Also accepts io.StringIO/BytesIO(<str/bytes>).
+<DF>   = pd.read_csv/excel(<path/url/file>)    # Also `header/index_col/dtype/usecols/…=<obj>`.
+<list> = pd.read_html(<path/url/file>)         # Raises ImportError if webpage has zero tables.
+<S/DF> = pd.read_parquet/feather/hdf(<path…>)  # Read_hdf() accepts `key='<df_name>'` argument.
+<DF>   = pd.read_sql('<table/query>', <conn>)  # Pass SQLite3/Alchemy connection (see #SQLite).
 ```
 
 ```python
-<dict> = <DF>.to_dict('d/l/s/…')               # Returns columns as dicts, lists or series.
-<str>  = <DF>.to_json/csv/html/latex()         # Saves output to a file if path is passed.
-<DF>.to_pickle/excel(<path>)                   # Run `$ pip3 install "pandas[excel]" odfpy`.
+<DF>.to_json/csv/html/parquet/latex(<path>)    # Returns a string/bytes if path is omitted.
+<DF>.to_pickle/excel/feather/hdf(<path>)       # To_hdf() requires `key='<df_name>'` argument.
 <DF>.to_sql('<table_name>', <connection>)      # Also `if_exists='fail/replace/append'`.
 ```
 
-* **Read\_csv() only parses dates of columns that were specified by 'parse\_dates' argument. It automatically tries to detect the format, but it can be helped with 'date\_format' or 'datefirst' arguments. Both dates and datetimes get stored as pd.Timestamp objects.**
+* **`'$ pip3 install "pandas[excel]" odfpy lxml pyarrow'` installs dependencies.**
+* **Read\_csv() only parses dates of columns that were specified by 'parse\_dates' argument. It automatically tries to detect the format, but it can be helped with 'date\_format' or 'dayfirst' arguments. Both dates and datetimes get stored as pd.Timestamp objects.**
 * **If there's a single invalid date then it returns the whole column as a series of strings, unlike `'<S> = pd.to_datetime(<S>, errors="coerce")'`, which uses pd.NaT.**
 * **To get specific attributes from a series of Timestamps use `'<S>.dt.year/date/…'`.**
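To make the `set_index()`, `xs()`, `stack()/unstack()` and `pivot_table()` lines in the hunk above concrete, here is a minimal sketch with an invented frame (the names `df`, `'year'`, `'city'` and `'sales'` are ours, not the cheatsheet's), assuming pandas 2.x:

```python
import pandas as pd

df = pd.DataFrame({'year':  [2024, 2024, 2025, 2025],
                   'city':  ['Oslo', 'Bergen', 'Oslo', 'Bergen'],
                   'sales': [1, 2, 3, 4]})

df_mi = df.set_index(['year', 'city'])   # Two-level multi-index. `append=True` would keep the old index.
oslo  = df_mi.xs('Oslo', level='city')   # Rows whose key on level 'city' is 'Oslo'.
wide  = df_mi.unstack(level=-1)          # Moves the innermost row level ('city') into column keys.
tall  = wide.stack()                     # Inverse operation: column keys back into row keys.
pivot = df.pivot_table(index='year', columns='city',
                       values='sales', aggfunc='mean')   # Builds a similar wide table directly.
```

The `append=False` mentioned in the updated comment is the default: the existing index is replaced rather than extended.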
diff --git a/index.html b/index.html
index e9492d1..eeb2d0e 100644
--- a/index.html
+++ b/index.html
@@ -55,7 +55,7 @@
- +
@@ -2724,25 +2724,25 @@ c  6  7
 
 DataFrame — Multi-Index:
 
 <DF>   = <DF>.xs(key, level=<int>)             # Rows with key on passed level of multi-index.
 <DF>   = <DF>.xs(keys, level=<ints>, axis=1)   # Cols that have first key on first level, etc.
-<DF>   = <DF>.set_index(col_keys)              # Combines multiple columns into a multi-index.
+<DF>   = <DF>.set_index(col_keys)              # Creates index from cols. Also `append=False`.
 <S/DF> = <DF>.stack/unstack(level=-1)          # Combines col keys with row keys or vice versa.
 <DF>   = <DF>.pivot_table(index=col_key/s)     # `columns=key/s, values=key/s, aggfunc='mean'`.
 
-DataFrame — Encode, Decode:
-<DF>   = pd.read_json/pickle(<path/url/file>)  # Also accepts io.StringIO/BytesIO(<str/bytes>).
-<DF>   = pd.read_csv(<path/url/file>)          # `header/index_col/dtype/usecols/…=<obj>`.
-<DF>   = pd.read_excel(<path/url/file>)        # `sheet_name=None` returns dict of all sheets.
-<DF>   = pd.read_sql('<table/query>', <conn>)  # SQLite3/SQLAlchemy connection (see #SQLite).
-<list> = pd.read_html(<path/url/file>)         # Run `$ pip3 install beautifulsoup4 lxml`.
+File Formats
+<S/DF> = pd.read_json/pickle(<path/url/file>)  # Also accepts io.StringIO/BytesIO(<str/bytes>).
+<DF>   = pd.read_csv/excel(<path/url/file>)    # Also `header/index_col/dtype/usecols/…=<obj>`.
+<list> = pd.read_html(<path/url/file>)         # Raises ImportError if webpage has zero tables.
+<S/DF> = pd.read_parquet/feather/hdf(<path…>)  # Read_hdf() accepts `key='<df_name>'` argument.
+<DF>   = pd.read_sql('<table/query>', <conn>)  # Pass SQLite3/Alchemy connection (see #SQLite).
 
-<dict> = <DF>.to_dict('d/l/s/…')               # Returns columns as dicts, lists or series.
-<str>  = <DF>.to_json/csv/html/latex()         # Saves output to a file if path is passed.
-<DF>.to_pickle/excel(<path>)                   # Run `$ pip3 install "pandas[excel]" odfpy`.
+<DF>.to_json/csv/html/parquet/latex(<path>)    # Returns a string/bytes if path is omitted.
+<DF>.to_pickle/excel/feather/hdf(<path>)       # To_hdf() requires `key='<df_name>'` argument.
 <DF>.to_sql('<table_name>', <connection>)      # Also `if_exists='fail/replace/append'`.
 
-  • Read_csv() only parses dates of columns that were specified by 'parse_dates' argument. It automatically tries to detect the format, but it can be helped with 'date_format' or 'datefirst' arguments. Both dates and datetimes get stored as pd.Timestamp objects.
+  • '$ pip3 install "pandas[excel]" odfpy lxml pyarrow' installs dependencies.
+  • Read_csv() only parses dates of columns that were specified by 'parse_dates' argument. It automatically tries to detect the format, but it can be helped with 'date_format' or 'dayfirst' arguments. Both dates and datetimes get stored as pd.Timestamp objects.
   • If there's a single invalid date then it returns the whole column as a series of strings, unlike '<S> = pd.to_datetime(<S>, errors="coerce")', which uses pd.NaT.
   • To get specific attributes from a series of Timestamps use '<S>.dt.year/date/…'.
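The read/write pairs in the renamed File Formats section round-trip cleanly. A small sketch of a few of them, with arbitrary file and table names, assuming pyarrow is installed for the Parquet step (see the install bullet above):

```python
import io
import sqlite3
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3.0, 4.0]})

csv_text = df.to_csv()                           # Returns a string because no path was passed.
df_csv   = pd.read_csv(io.StringIO(csv_text), index_col=0)

df.to_parquet('frame.parquet')                   # Needs a Parquet engine such as pyarrow.
df_parq = pd.read_parquet('frame.parquet')

conn = sqlite3.connect('frame.db')
df.to_sql('frame', conn, if_exists='replace')    # Default index is stored in a column named 'index'.
df_sql = pd.read_sql('SELECT * FROM frame', conn, index_col='index')
conn.close()
```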
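The three date bullets can likewise be checked with a tiny invented CSV, again assuming pandas 2.x:

```python
import io
import pandas as pd

csv_file = io.StringIO('date,qty\n01/02/2024,1\nnot-a-date,2')

df = pd.read_csv(csv_file, parse_dates=['date'], dayfirst=True)
print(df['date'].dtype)   # object: one invalid date keeps the whole column as strings.

s = pd.to_datetime(df['date'], errors='coerce', dayfirst=True)
print(s)                  # 2024-02-01 and NaT: invalid entries are coerced instead.
print(s.dt.year)          # The `.dt` accessor extracts attributes from each Timestamp.
```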
@@ -2934,7 +2934,7 @@ $ deactivate # Deactivates the active
- +