API Reference

Input/Output

Pickling

Flat File

Clipboard

Excel

JSON

json_normalize(data[, record_path, meta, ...]) “Normalize” semi-structured JSON data into a flat table
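As a rough sketch of what flattening with record_path and meta looks like, the example below normalizes a small nested structure. The sample records and their numbers are made up for illustration, and the import fallback is an assumption to cover both older releases (pandas.io.json) and newer ones (top-level pandas).

```python
try:
    from pandas import json_normalize          # newer pandas releases
except ImportError:
    from pandas.io.json import json_normalize  # older pandas releases

# Hypothetical nested records: each state holds a list of counties.
data = [
    {"state": "Ohio",
     "counties": [{"name": "Summit", "population": 1234},
                  {"name": "Cuyahoga", "population": 1337}]},
    {"state": "Florida",
     "counties": [{"name": "Dade", "population": 12345}]},
]

# One output row per county; the parent "state" is repeated as metadata.
flat = json_normalize(data, record_path="counties", meta=["state"])
print(flat)
```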

HTML

HDFStore: PyTables (HDF5)

SAS

SQL

Google BigQuery

read_gbq(query[, project_id, index_col, ...]) Load data from Google BigQuery.
to_gbq(dataframe, destination_table, project_id) Write a DataFrame to a Google BigQuery table.

STATA

StataReader.data(**kwargs) DEPRECATED: Reads observations from Stata file, converting them into a dataframe
StataReader.data_label() Returns data label of Stata file
StataReader.value_labels() Returns a dict, associating each variable name with a dict, associating each value with its corresponding label
StataReader.variable_labels() Returns variable labels as a dict, associating each variable name with its corresponding label
StataWriter.write_file()

General functions

Data manipulations

Top-level missing data

Top-level conversions

Top-level dealing with datetimelike

Top-level evaluation

Standard moving window functions

Standard expanding window functions

Exponentially-weighted moving window functions

Series

Constructor

Attributes

Axes
  • index: axis labels
Series.values Return Series as ndarray or ndarray-like
Series.dtype return the dtype object of the underlying data
Series.ftype return if the data is sparse|dense
Series.shape return a tuple of the shape of the underlying data
Series.nbytes return the number of bytes in the underlying data
Series.ndim return the number of dimensions of the underlying data, by definition 1
Series.size return the number of elements in the underlying data
Series.strides return the strides of the underlying data
Series.itemsize return the size of the dtype of the item of the underlying data
Series.base return the base object if the memory of the underlying data is shared
Series.T return the transpose, which is by definition self

Conversion

Indexing, iteration

Series.at Fast label-based scalar accessor
Series.iat Fast integer location scalar accessor.
Series.ix A primarily label-location based indexer, with integer position fallback.
Series.loc Purely label-location based indexer for selection by label.
Series.iloc Purely integer-location based indexing for selection by position.

For more information on .at, .iat, .ix, .loc, and .iloc, see the indexing documentation.
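A minimal sketch of how the label-based and position-based accessors differ, using a small made-up Series. The .ix indexer is left out of the sketch so that label and position lookups stay clearly separated.

```python
import pandas as pd

# Hypothetical data: three values with string labels.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

label_scalar = s.at["b"]        # scalar lookup by label
position_scalar = s.iat[2]      # scalar lookup by integer position
label_slice = s.loc["a":"b"]    # label slice; both endpoints are included
position_slice = s.iloc[0:2]    # position slice; the end is excluded

print(label_scalar, position_scalar)
print(label_slice.tolist(), position_slice.tolist())
```

Note that the label slice "a":"b" and the position slice 0:2 select the same two elements here, because label slices include their end point.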

Binary operator functions

Function application, GroupBy

Computations / Descriptive Stats

Reindexing / Selection / Label manipulation

Missing data handling

Reshaping, sorting

Combining / joining / merging

Datetimelike Properties

Series.dt can be used to access the values of the series as datetimelike and return several properties. These can be accessed like Series.dt.<property>.

Datetime Properties

Series.dt.date Returns numpy array of datetime.date.
Series.dt.time Returns numpy array of datetime.time.
Series.dt.year The year of the datetime
Series.dt.month The month as January=1, December=12
Series.dt.day The days of the datetime
Series.dt.hour The hours of the datetime
Series.dt.minute The minutes of the datetime
Series.dt.second The seconds of the datetime
Series.dt.microsecond The microseconds of the datetime
Series.dt.nanosecond The nanoseconds of the datetime
Series.dt.week The week ordinal of the year
Series.dt.weekofyear The week ordinal of the year
Series.dt.dayofweek The day of the week with Monday=0, Sunday=6
Series.dt.weekday The day of the week with Monday=0, Sunday=6
Series.dt.dayofyear The ordinal day of the year
Series.dt.quarter The quarter of the date
Series.dt.is_month_start Logical indicating if first day of month (defined by frequency)
Series.dt.is_month_end Logical indicating if last day of month (defined by frequency)
Series.dt.is_quarter_start Logical indicating if first day of quarter (defined by frequency)
Series.dt.is_quarter_end Logical indicating if last day of quarter (defined by frequency)
Series.dt.is_year_start Logical indicating if first day of year (defined by frequency)
Series.dt.is_year_end Logical indicating if last day of year (defined by frequency)
Series.dt.daysinmonth The number of days in the month
Series.dt.days_in_month The number of days in the month
Series.dt.tz
Series.dt.freq get/set the frequency of the Index
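The datetime properties listed above can be tried on a small Series of timestamps. This is only an illustrative sketch; the two dates are arbitrary sample values.

```python
import pandas as pd

# Two arbitrary timestamps: a Sunday at month end and the following Monday.
s = pd.Series(pd.to_datetime(["2016-01-31 08:30:00", "2016-02-01 17:45:00"]))

years = s.dt.year.tolist()
months = s.dt.month.tolist()
weekdays = s.dt.dayofweek.tolist()        # Monday=0 ... Sunday=6
month_end = s.dt.is_month_end.tolist()
month_lengths = s.dt.days_in_month.tolist()

print(years, months, weekdays, month_end, month_lengths)
```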

Datetime Methods

Timedelta Properties

Series.dt.days Number of days for each element.
Series.dt.seconds Number of seconds (>= 0 and less than 1 day) for each element.
Series.dt.microseconds Number of microseconds (>= 0 and less than 1 second) for each element.
Series.dt.nanoseconds Number of nanoseconds (>= 0 and less than 1 microsecond) for each element.
Series.dt.components Return a dataframe of the components (days, hours, minutes, seconds, milliseconds, microseconds, nanoseconds) of the Timedeltas.
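A short sketch of the timedelta properties above, assuming a Series built with pd.to_timedelta from two made-up durations. It shows that seconds only covers the part of the duration below one day.

```python
import pandas as pd

# Hypothetical durations.
td = pd.Series(pd.to_timedelta(["1 days 02:03:04", "0 days 00:00:30"]))

days = td.dt.days.tolist()          # whole days per element
seconds = td.dt.seconds.tolist()    # remaining seconds within the day
components = td.dt.components       # one column per component

print(days, seconds)
print(components)
```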

Timedelta Methods

String handling

Series.str can be used to access the values of the series as strings and apply several methods to them. These can be accessed like Series.str.<function/property>.
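As an illustration of the accessor pattern, the sketch below applies a few string methods to a made-up Series that contains a missing value. The missing entry stays missing in each result rather than raising an error.

```python
import pandas as pd

# Hypothetical data with one missing entry.
s = pd.Series(["Alpha", "beta", None])

upper = s.str.upper()           # element-wise upper-casing
lengths = s.str.len()           # element-wise string length
has_ph = s.str.contains("ph")   # element-wise substring test

print(upper.tolist())
print(lengths.tolist())
print(has_ph.tolist())
```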

Categorical

If the Series is of dtype category, Series.cat can be used to change the categorical data. This accessor is similar to Series.dt or Series.str and has the following usable methods and properties:

Series.cat.categories The categories of this categorical.
Series.cat.ordered Gets the ordered attribute
Series.cat.codes

To create a Series of dtype category, use cat = s.astype("category").

The following two Categorical constructors are considered API but should only be used when adding ordering information or special categories is needed at creation time of the categorical data:

np.asarray(categorical) works by implementing the array interface. Be aware that this converts the Categorical back to a numpy array, so categories and order information are not preserved!
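The following sketch, built on made-up data, shows the conversion to dtype category, the accessor properties listed above, and what np.asarray returns. Using cat.values to obtain the underlying Categorical is an assumption of this example.

```python
import numpy as np
import pandas as pd

# Hypothetical string data.
s = pd.Series(["low", "high", "low", "mid"])
cat = s.astype("category")

categories = list(cat.cat.categories)   # unique values, sorted
codes = cat.cat.codes.tolist()          # integer position of each value in categories
ordered = cat.cat.ordered               # False unless an order was requested

# Converting back yields a plain numpy array of the original values;
# the categories and the ordered flag are not carried along.
arr = np.asarray(cat.values)

print(categories, codes, ordered)
print(type(arr), arr.tolist())
```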

Plotting

Series.plot is both a callable method and a namespace attribute for specific plotting methods of the form Series.plot.<kind>.

Serialization / IO / Conversion

Sparse methods

DataFrame

Constructor

Attributes and underlying data

Axes

  • index: row labels
  • columns: column labels
DataFrame.dtypes Return the dtypes in this object
DataFrame.ftypes Return the ftypes (indication of sparse/dense and dtype) in this object.
DataFrame.values Numpy representation of NDFrame
DataFrame.axes Return a list with the row axis labels and column axis labels as the only members.
DataFrame.ndim Number of axes / array dimensions
DataFrame.size number of elements in the NDFrame
DataFrame.shape Return a tuple representing the dimensionality of the DataFrame.
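A minimal sketch of the attributes above on a small, made-up DataFrame. It only reads metadata and does not modify the frame.

```python
import pandas as pd

# Hypothetical frame: three rows, one integer and one float column.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

shape = df.shape                        # (number of rows, number of columns)
ndim = df.ndim                          # number of axes
size = df.size                          # rows * columns
row_labels, column_labels = df.axes     # [index, columns]
dtypes = df.dtypes                      # one dtype per column

print(shape, ndim, size)
print(list(row_labels), list(column_labels))
print(dtypes)
```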

Conversion

Indexing, iteration

DataFrame.at Fast label-based scalar accessor
DataFrame.iat Fast integer location scalar accessor.
DataFrame.ix A primarily label-location based indexer, with integer position fallback.
DataFrame.loc Purely label-location based indexer for selection by label.
DataFrame.iloc Purely integer-location based indexing for selection by position.

For more information on .at, .iat, .ix, .loc, and .iloc, see the indexing documentation.

Binary operator functions

Function application, GroupBy

Computations / Descriptive Stats

Reindexing / Selection / Label manipulation

Missing data handling

Reshaping, sorting, transposing

DataFrame.T Transpose index and columns

Combining / joining / merging

Time series-related

Plotting

DataFrame.plot is both a callable method and a namespace attribute for specific plotting methods of the form DataFrame.plot.<kind>.

Serialization / IO / Conversion

Panel

Constructor

Attributes and underlying data

Axes

  • items: axis 0; each item corresponds to a DataFrame contained inside
  • major_axis: axis 1; the index (rows) of each of the DataFrames
  • minor_axis: axis 2; the columns of each of the DataFrames
Panel.values Numpy representation of NDFrame
Panel.axes Return index label(s) of the internal NDFrame
Panel.ndim Number of axes / array dimensions
Panel.size number of elements in the NDFrame
Panel.shape Return a tuple of axis dimensions
Panel.dtypes Return the dtypes in this object
Panel.ftypes Return the ftypes (indication of sparse/dense and dtype) in this object.

Conversion

Getting and setting

Indexing, iteration, slicing

Panel.at Fast label-based scalar accessor
Panel.iat Fast integer location scalar accessor.
Panel.ix A primarily label-location based indexer, with integer position fallback.
Panel.loc Purely label-location based indexer for selection by label.
Panel.iloc Purely integer-location based indexing for selection by position.

For more information on .at, .iat, .ix, .loc, and .iloc, see the indexing documentation.

Binary operator functions

Function application, GroupBy

Computations / Descriptive Stats

Reindexing / Selection / Label manipulation

Missing data handling

Reshaping, sorting, transposing

Combining / joining / merging

Time series-related

Serialization / IO / Conversion

Panel4D

Constructor

Attributes and underlying data

Axes

  • labels: axis 0; each label corresponds to a Panel contained inside
  • items: axis 1; each item corresponds to a DataFrame contained inside
  • major_axis: axis 2; the index (rows) of each of the DataFrames
  • minor_axis: axis 3; the columns of each of the DataFrames
Panel4D.values Numpy representation of NDFrame
Panel4D.axes Return index label(s) of the internal NDFrame
Panel4D.ndim Number of axes / array dimensions
Panel4D.size number of elements in the NDFrame
Panel4D.shape Return a tuple of axis dimensions
Panel4D.dtypes Return the dtypes in this object
Panel4D.ftypes Return the ftypes (indication of sparse/dense and dtype) in this object.

Conversion

Index

Many of these methods or variants thereof are available on the objects that contain an index (Series/DataFrame), and those should most likely be used before calling these methods directly.

Attributes

Index.values return the underlying data as an ndarray
Index.is_monotonic alias for is_monotonic_increasing (deprecated)
Index.is_monotonic_increasing return if the index is monotonic increasing (only equal or increasing values)
Index.is_monotonic_decreasing return if the index is monotonic decreasing (only equal or decreasing values)
Index.is_unique
Index.has_duplicates
Index.dtype
Index.inferred_type
Index.is_all_dates
Index.shape return a tuple of the shape of the underlying data
Index.nbytes return the number of bytes in the underlying data
Index.ndim return the number of dimensions of the underlying data, by definition 1
Index.size return the number of elements in the underlying data
Index.strides return the strides of the underlying data
Index.itemsize return the size of the dtype of the item of the underlying data
Index.base return the base object if the memory of the underlying data is shared
Index.T return the transpose, which is by definition self

Modifying and Computations

Conversion

Sorting

Time-specific operations

Combining / joining / set operations

Selecting

DatetimeIndex

Time/Date Components

DatetimeIndex.year The year of the datetime
DatetimeIndex.month The month as January=1, December=12
DatetimeIndex.day The days of the datetime
DatetimeIndex.hour The hours of the datetime
DatetimeIndex.minute The minutes of the datetime
DatetimeIndex.second The seconds of the datetime
DatetimeIndex.microsecond The microseconds of the datetime
DatetimeIndex.nanosecond The nanoseconds of the datetime
DatetimeIndex.date Returns numpy array of datetime.date.
DatetimeIndex.time Returns numpy array of datetime.time.
DatetimeIndex.dayofyear The ordinal day of the year
DatetimeIndex.weekofyear The week ordinal of the year
DatetimeIndex.week The week ordinal of the year
DatetimeIndex.dayofweek The day of the week with Monday=0, Sunday=6
DatetimeIndex.weekday The day of the week with Monday=0, Sunday=6
DatetimeIndex.quarter The quarter of the date
DatetimeIndex.tz
DatetimeIndex.freq get/set the frequency of the Index
DatetimeIndex.freqstr return the frequency object as a string if it is set, otherwise None
DatetimeIndex.is_month_start Logical indicating if first day of month (defined by frequency)
DatetimeIndex.is_month_end Logical indicating if last day of month (defined by frequency)
DatetimeIndex.is_quarter_start Logical indicating if first day of quarter (defined by frequency)
DatetimeIndex.is_quarter_end Logical indicating if last day of quarter (defined by frequency)
DatetimeIndex.is_year_start Logical indicating if first day of year (defined by frequency)
DatetimeIndex.is_year_end Logical indicating if last day of year (defined by frequency)
DatetimeIndex.inferred_freq
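The frequency-related attributes are easiest to see on a regular index. The sketch below assumes a daily index created with pd.date_range; the start date is arbitrary.

```python
import pandas as pd

# Four consecutive days starting on an arbitrary date (a Friday).
idx = pd.date_range("2016-01-01", periods=4, freq="D")

freq_string = idx.freqstr                # the frequency as a string
inferred = idx.inferred_freq             # frequency guessed from the values
weekdays = list(idx.dayofweek)           # Monday=0 ... Sunday=6
year_start = list(idx.is_year_start)
quarters = list(idx.quarter)

print(freq_string, inferred)
print(weekdays, year_start, quarters)
```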

Selecting

Time-specific operations

Conversion

TimedeltaIndex

Components

TimedeltaIndex.days Number of days for each element.
TimedeltaIndex.seconds Number of seconds (>= 0 and less than 1 day) for each element.
TimedeltaIndex.microseconds Number of microseconds (>= 0 and less than 1 second) for each element.
TimedeltaIndex.nanoseconds Number of nanoseconds (>= 0 and less than 1 microsecond) for each element.
TimedeltaIndex.components Return a dataframe of the components (days, hours, minutes, seconds, milliseconds, microseconds, nanoseconds) of the Timedeltas.
TimedeltaIndex.inferred_freq

Conversion

GroupBy

GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby(), pandas.Series.groupby(), etc.

Indexing, iteration

GroupBy.__iter__() Groupby iterator
GroupBy.groups dict {group name -> group labels}
GroupBy.indices dict {group name -> group indices}
GroupBy.get_group(name[, obj]) Constructs NDFrame from group with provided name
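To make the difference between groups (labels) and indices (integer positions) concrete, here is a small sketch on a made-up frame with a non-default index. The column names and row labels are hypothetical.

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame(
    {"key": ["x", "y", "x", "y"], "val": [1, 2, 3, 4]},
    index=["r0", "r1", "r2", "r3"],
)
grouped = df.groupby("key")

group_labels = list(grouped.groups["x"])         # row labels in group "x"
group_positions = grouped.indices["x"].tolist()  # integer positions in group "x"
x_rows = grouped.get_group("x")                  # sub-frame for group "x"
names = [name for name, frame in grouped]        # iteration yields (name, frame) pairs

print(group_labels, group_positions, names)
print(x_rows)
```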

Function application

GroupBy.apply(func, *args, **kwargs) Apply function and combine results together in an intelligent way.
GroupBy.aggregate(func, *args, **kwargs)
GroupBy.transform(func, *args, **kwargs)
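The three application methods differ mainly in the shape of what they return. The sketch below, on made-up data, reduces each group with aggregate, keeps the input shape with transform, and runs an arbitrary per-group function with apply.

```python
import pandas as pd

# Hypothetical data: two groups of two values each.
df = pd.DataFrame({"key": ["x", "y", "x", "y"], "val": [1.0, 2.0, 3.0, 5.0]})
grouped = df.groupby("key")["val"]

# aggregate: one result per group.
totals = grouped.aggregate("sum")

# transform: result has the same length and index as the input.
demeaned = grouped.transform(lambda v: v - v.mean())

# apply: arbitrary function per group; results are combined.
ranges = grouped.apply(lambda v: v.max() - v.min())

print(totals.to_dict())
print(demeaned.tolist())
print(ranges.to_dict())
```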

Computations / Descriptive Stats

GroupBy.count() Compute count of group, excluding missing values
GroupBy.cumcount([ascending]) Number each item in each group from 0 to the length of that group - 1.
GroupBy.first() Compute first of group values
GroupBy.head([n]) Returns first n rows of each group.
GroupBy.last() Compute last of group values
GroupBy.max() Compute max of group values
GroupBy.mean() Compute mean of groups, excluding missing values
GroupBy.median() Compute median of groups, excluding missing values
GroupBy.min() Compute min of group values
GroupBy.nth(n[, dropna]) Take the nth row from each group if n is an int, or a subset of rows if n is a list of ints.
GroupBy.ohlc() Compute open, high, low and close values of each group, excluding missing values
GroupBy.prod() Compute prod of group values
GroupBy.size() Compute group sizes
GroupBy.sem([ddof]) Compute standard error of the mean of groups, excluding missing values
GroupBy.std([ddof]) Compute standard deviation of groups, excluding missing values
GroupBy.sum() Compute sum of group values
GroupBy.var([ddof]) Compute variance of groups, excluding missing values
GroupBy.tail([n]) Returns last n rows of each group
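A few of the methods above, sketched on made-up data with one missing value. The example contrasts count (which excludes missing values) with size (which does not) and shows the per-group numbering from cumcount.

```python
import numpy as np
import pandas as pd

# Hypothetical data; group "y" contains one missing value.
df = pd.DataFrame({
    "key": ["x", "y", "x", "x", "y"],
    "val": [1.0, np.nan, 3.0, 4.0, 5.0],
})
grouped = df.groupby("key")

counts = grouped["val"].count()        # non-missing values per group
sizes = grouped.size()                 # rows per group, missing or not
means = grouped["val"].mean()          # missing values are skipped
numbering = grouped.cumcount().tolist()
first_rows = grouped.head(1)           # first row of each group
last_rows = grouped.tail(1)            # last row of each group

print(counts.to_dict(), sizes.to_dict(), means.to_dict())
print(numbering)
```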

The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly: the DataFrameGroupBy version usually permits the specification of an axis argument, and often an argument indicating whether to restrict application to columns of a specific data type.

DataFrameGroupBy.bfill([axis, inplace, ...]) Synonym for NDFrame.fillna(method='bfill')
DataFrameGroupBy.cummax([axis, dtype, out, ...]) Return cumulative max over requested axis.
DataFrameGroupBy.cummin([axis, dtype, out, ...]) Return cumulative min over requested axis.
DataFrameGroupBy.cumprod([axis]) Cumulative product for each group
DataFrameGroupBy.cumsum([axis]) Cumulative sum for each group
DataFrameGroupBy.describe([percentiles, ...]) Generate various summary statistics, excluding NaN values.
DataFrameGroupBy.all([axis, bool_only, ...]) Return whether all elements are True over requested axis
DataFrameGroupBy.any([axis, bool_only, ...]) Return whether any element is True over requested axis
DataFrameGroupBy.corr([method, min_periods]) Compute pairwise correlation of columns, excluding NA/null values
DataFrameGroupBy.cov([min_periods]) Compute pairwise covariance of columns, excluding NA/null values
DataFrameGroupBy.diff([periods, axis]) 1st discrete difference of object
DataFrameGroupBy.ffill([axis, inplace, ...]) Synonym for NDFrame.fillna(method='ffill')
DataFrameGroupBy.fillna([value, method, ...]) Fill NA/NaN values using the specified method
DataFrameGroupBy.hist(data[, column, by, ...]) Draw histogram of the DataFrame’s series using matplotlib / pylab.
DataFrameGroupBy.idxmax([axis, skipna]) Return index of first occurrence of maximum over requested axis.
DataFrameGroupBy.idxmin([axis, skipna]) Return index of first occurrence of minimum over requested axis.
DataFrameGroupBy.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis
DataFrameGroupBy.pct_change([periods, ...]) Percent change over given number of periods.
DataFrameGroupBy.plot Class implementing the .plot attribute for groupby objects
DataFrameGroupBy.quantile([q, axis, ...]) Return values at the given quantile over requested axis, a la numpy.percentile.
DataFrameGroupBy.rank([axis, numeric_only, ...]) Compute numerical data ranks (1 through n) along axis.
DataFrameGroupBy.resample(rule[, how, axis, ...]) Convenience method for frequency conversion and resampling of regular time-series data.
DataFrameGroupBy.shift([periods, freq, axis]) Shift each group by periods observations
DataFrameGroupBy.skew([axis, skipna, level, ...]) Return unbiased skew over requested axis
DataFrameGroupBy.take(indices[, axis, ...]) Analogous to ndarray.take
DataFrameGroupBy.tshift([periods, freq, axis]) Shift the time index, using the index’s frequency if available

The following methods are available only for SeriesGroupBy objects.

SeriesGroupBy.nlargest(*args, **kwargs) Return the largest n elements.
SeriesGroupBy.nsmallest(*args, **kwargs) Return the smallest n elements.
SeriesGroupBy.nunique([dropna])
SeriesGroupBy.unique() Return array of unique values in the object.
SeriesGroupBy.value_counts([normalize, ...])
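The SeriesGroupBy-only methods can be sketched by selecting a single column from a grouped frame. The data below is made up, and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical data: group "x" repeats the value 1.
df = pd.DataFrame({
    "key": ["x", "x", "x", "y", "y"],
    "val": [1, 1, 3, 2, 5],
})
series_grouped = df.groupby("key")["val"]   # a SeriesGroupBy object

distinct = series_grouped.nunique()         # number of distinct values per group
largest = series_grouped.nlargest(1)        # largest value within each group
counts = series_grouped.value_counts()      # indexed by (group, value)

print(distinct.to_dict())
print(largest.tolist())
print(counts)
```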

The following methods are available only for DataFrameGroupBy objects.

DataFrameGroupBy.corrwith(other[, axis, drop]) Compute pairwise correlation between rows or columns of two DataFrame objects.

Style

Styler objects are returned by pandas.DataFrame.style.

Constructor

Styler(data[, precision, table_styles, ...]) Helps style a DataFrame or Series according to the data with HTML and CSS.

Style Application

Styler.apply(func[, axis, subset]) Apply a function column-wise, row-wise, or table-wise, updating the HTML representation with the result.
Styler.applymap(func[, subset]) Apply a function elementwise, updating the HTML representation with the result.
Styler.set_precision(precision) Set the precision used to render.
Styler.set_table_styles(table_styles) Set the table styles on a Styler
Styler.set_caption(caption) Set the caption on a Styler
Styler.set_properties([subset]) Convenience method for setting one or more non-data dependent properties on each cell.
Styler.set_uuid(uuid) Set the uuid for a Styler.
Styler.clear() “Reset” the styler, removing any previously applied styles.

Builtin Styles

Styler.highlight_max([subset, color, axis]) Highlight the maximum by shading the background
Styler.highlight_min([subset, color, axis]) Highlight the minimum by shading the background
Styler.highlight_null([null_color]) Shade the background null_color for missing values.
Styler.background_gradient([cmap, low, ...]) Color the background in a gradient according to the data in each column (optionally row).
Styler.bar([subset, axis, color, width]) Color the background proportional to the values in each column.

Style Export and Import

Styler.render() Render the built up styles to HTML
Styler.export() Export the styles applied to the current Styler.
Styler.use(styles) Set the styles on the current Styler, possibly using styles from Styler.export.

General utility functions

Working with options