Here's a list of common pandas methods for data manipulation tasks, categorized by functionality:
Basic Methods:
- pd.read_csv(): Load a CSV file into a DataFrame.
- df.head(): View the first 5 rows of a DataFrame (pass n for a different count).
- df.tail(): View the last 5 rows of a DataFrame (pass n for a different count).
- df.info(): Print a summary of the DataFrame with column types, non-null counts, and memory usage.
- df.describe(): Get summary statistics (count, mean, std, min, quartiles, max) for numeric columns.
- df.shape: Get the number of rows and columns as a (rows, columns) tuple.
- df.columns: Get the column labels of a DataFrame.
- df.dtypes: Check the data type of each column.
- df.set_index(): Set one or more columns as the index.
- df.reset_index(): Reset to the default integer index; the old index becomes a column unless drop=True.
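As a minimal sketch of the inspection methods above, the following loads a small CSV and looks at its structure. The CSV content and column names are made up for illustration, and the text is read from memory so no file is needed.

```python
import io
import pandas as pd

# Hypothetical CSV content, held in memory so the example needs no file on disk.
csv_text = """name,age,city
Ana,34,Lisbon
Ben,28,Oslo
Chloe,41,Paris
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))        # first two rows
print(df.shape)          # (rows, columns)
print(list(df.columns))  # column labels
print(df.dtypes)         # dtype of each column

# set_index returns a new DataFrame; the original is unchanged.
indexed = df.set_index("name")
# reset_index moves the index back into a regular column.
restored = indexed.reset_index()
```

Both set_index() and reset_index() return new objects by default, so the result has to be assigned to a name to be kept.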
Accessing and Selecting Data:
- df.loc[]: Access rows and columns by label (single labels, label slices, or boolean masks).
- df.iloc[]: Access rows and columns by integer position.
- df.at[]: Access a single value by row/column label.
- df.iat[]: Access a single value by integer row/column position.
- df['column_name']: Access a single column as a Series.
- df[['col1', 'col2']]: Access multiple columns as a DataFrame.
- df.loc[condition]: Access rows based on conditions (e.g., df.loc[df['col'] > 10]). Boolean masks belong with loc, not iloc; iloc rejects a boolean Series.
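The difference between label-based and position-based access is easiest to see on a frame with a non-default index. This is an illustrative sketch; the frame and its labels are invented.

```python
import pandas as pd

df = pd.DataFrame(
    {"col": [5, 12, 20], "other": ["a", "b", "c"]},
    index=["x", "y", "z"],
)

by_label = df.loc["y", "col"]       # label-based: row "y", column "col"
by_position = df.iloc[1, 0]         # position-based: second row, first column
scalar_label = df.at["z", "other"]  # scalar access by label
scalar_pos = df.iat[2, 1]           # scalar access by position

# Boolean masks go through loc (or plain df[mask]);
# passing a boolean Series to iloc raises an error.
large = df.loc[df["col"] > 10]
```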
Manipulating Data:
- df.drop(): Drop rows or columns from the DataFrame.
- df.dropna(): Drop rows (or columns) that contain missing values.
- df.fillna(): Fill missing values with a specified value (use df.ffill() or df.bfill() to propagate neighboring values).
- df.rename(): Rename columns or index labels.
- df.assign(): Return a new DataFrame with added or overwritten columns.
- df.insert(): Insert a new column at a specific position (modifies the DataFrame in place).
- df.append(): Append rows to a DataFrame. Removed in pandas 2.0; use pd.concat() instead.
- pd.concat(): Concatenate multiple DataFrames along rows or columns. This is a top-level function, not a DataFrame method.
- df.merge(): Merge two DataFrames on common columns or indexes.
- df.join(): Join two DataFrames based on the index.
- df.replace(): Replace values in a DataFrame.
- df.apply(): Apply a function along an axis (rows/columns).
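A short sketch of several of these methods chained together, using invented frames named left, right, and extra. It also shows pd.concat() doing the job that df.append() did before pandas 2.0.

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "score": [10.0, None, 30.0]})
right = pd.DataFrame({"id": [2, 3, 4], "team": ["red", "blue", "green"]})

# Fill the missing score, then derive a new column without mutating `left`.
cleaned = left.fillna({"score": 0.0}).assign(double=lambda d: d["score"] * 2)

# pd.concat adds rows; ignore_index=True renumbers the result from 0.
extra = pd.DataFrame({"id": [5], "score": [50.0]})
stacked = pd.concat([left, extra], ignore_index=True)

# An inner merge keeps only the ids present in both frames.
merged = left.merge(right, on="id", how="inner")

# rename and drop also return new frames rather than editing in place.
renamed = merged.rename(columns={"score": "points"}).drop(columns=["team"])
```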
Filtering and Querying:
- df.query(): Query the DataFrame using a string expression.
- df[condition]: Filter rows based on conditions (e.g., df[df['age'] > 30]).
- df['col'].str.contains(): Test whether each string contains a pattern. The .str accessor belongs to a Series (a single column), not to the DataFrame.
- df['col'].str.match(): Test whether each string matches a regex pattern, anchored at the start of the string.
- df.isna(): Detect missing values.
- df.notna(): Detect non-missing values.
- df.isnull(): Alias of df.isna().
- df.notnull(): Alias of df.notna().
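The filters above all work by building a boolean mask and indexing with it. Here is a small sketch on made-up data; the na=False argument is one way to handle missing strings, chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Ana", "Ben", None, "Anton"], "age": [34, 28, 45, 19]}
)

over_30 = df[df["age"] > 30]
same_via_query = df.query("age > 30")

# na=False treats missing names as non-matches instead of
# letting missing values leak into the boolean mask.
starts_with_an = df[df["name"].str.match(r"An", na=False)]
contains_n = df[df["name"].str.contains("n", na=False)]

missing_names = df["name"].isna().sum()
```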
Aggregation and Transformation:
- df.groupby(): Group data based on columns and apply aggregation.
- df.aggregate(): Apply one or more aggregate functions to data (df.agg() is an alias).
- df.mean(): Calculate the mean of each column (use df['col'].mean() for a single column).
- df.sum(): Calculate the sum of each column.
- df.min(): Find the minimum value in each column.
- df.max(): Find the maximum value in each column.
- df.count(): Count non-null values in each column.
- df.cumsum(): Compute the cumulative sum.
- df.cumprod(): Compute the cumulative product.
- df.transform(): Apply a function that returns a result with the same shape as the input; after groupby(), it broadcasts each group's result back to that group's rows.
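The difference between aggregating and transforming a group can be sketched as follows, with an invented team/points frame. Aggregation yields one row per group, while transform keeps the original row count.

```python
import pandas as pd

df = pd.DataFrame(
    {"team": ["a", "a", "b", "b"], "points": [1, 3, 10, 20]}
)

# One value per group.
totals = df.groupby("team")["points"].sum()

# Several aggregations at once, one column per function.
summary = df.groupby("team")["points"].agg(["mean", "max"])

# transform returns a result aligned with the original rows,
# so it can be stored directly as a new column.
df["team_mean"] = df.groupby("team")["points"].transform("mean")

running = df["points"].cumsum()
```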
Date/Time Operations:
- pd.to_datetime(): Convert to datetime type.
- df['date'].dt.year: Extract year from a datetime column.
- df['date'].dt.month: Extract month from a datetime column.
- df['date'].dt.day: Extract day from a datetime column.
- df['date'].dt.weekday: Extract weekday (0 = Monday, 6 = Sunday).
- df['date'].dt.date: Extract date (without time).
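The .dt accessor only works once the column has a datetime dtype, so conversion comes first. A minimal sketch with two made-up timestamps:

```python
import pandas as pd

df = pd.DataFrame({"date": ["2024-01-01 08:30", "2024-02-15 17:45"]})
df["date"] = pd.to_datetime(df["date"])  # strings -> datetime64

df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["weekday"] = df["date"].dt.weekday  # Monday is 0
df["day_only"] = df["date"].dt.date    # Python date objects, time dropped
```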
Sorting and Ranking:
- df.sort_values(): Sort DataFrame by values in one or more columns.
- df.sort_index(): Sort DataFrame by index.
- df.rank(): Rank the values in a column.
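Sorting by several columns and handling ties in a ranking can be sketched like this. The data is invented, and method="min" is just one of the available tie-breaking options.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Chloe"], "score": [70, 90, 90]},
    index=[2, 0, 1],
)

# Highest score first; ties broken alphabetically by name.
by_score = df.sort_values(["score", "name"], ascending=[False, True])
by_index = df.sort_index()

# By default tied values share the average of their ranks;
# method="min" gives every tied value the lowest rank in the tie.
df["rank"] = df["score"].rank(ascending=False, method="min")
```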
Pivoting and Reshaping Data:
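- df.pivot(): Reshape the DataFrame from long to wide format without aggregating (index/column pairs must be unique).
- df.pivot_table(): Create a pivot table, aggregating duplicate entries.
- df.melt(): Unpivot the DataFrame from wide to long format.
- df.stack(): Move column labels into an inner level of the row index (wide to long).
- df.unstack(): Move an index level into the columns (long to wide).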
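A round trip between long and wide layouts shows how these methods relate. The sales frame below is hypothetical.

```python
import pandas as pd

long_df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Paris", "Paris"],
        "year": [2023, 2024, 2023, 2024],
        "sales": [10, 12, 20, 25],
    }
)

# pivot needs each (index, columns) pair to appear only once.
wide = long_df.pivot(index="city", columns="year", values="sales")

# pivot_table aggregates, so repeated pairs are allowed.
totals = long_df.pivot_table(index="city", values="sales", aggfunc="sum")

# melt turns the wide frame back into long form.
back_to_long = wide.reset_index().melt(
    id_vars="city", var_name="year", value_name="sales"
)

# stack moves the column labels into an inner index level; unstack undoes it.
stacked = wide.stack()
round_trip = stacked.unstack()
```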
Data Type Conversion:
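- df.astype(): Convert data types of columns.
- pd.to_numeric(): Convert a column to numeric values (a top-level function, e.g., pd.to_numeric(df['col'])).
- pd.to_datetime(): Convert a column to datetime (a top-level function, e.g., pd.to_datetime(df['col'])).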
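A brief sketch of the three conversions on string columns. The column names are illustrative, and errors="coerce" is one way to deal with values that cannot be parsed.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "qty": ["1", "2", "oops"],
        "price": ["1.5", "2.0", "3.25"],
        "when": ["2024-01-01", "2024-01-02", "2024-01-03"],
    }
)

# astype raises if any value cannot be converted.
df["price"] = df["price"].astype(float)

# errors="coerce" turns unparseable entries into NaN instead of raising.
df["qty"] = pd.to_numeric(df["qty"], errors="coerce")

df["when"] = pd.to_datetime(df["when"])
```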
Rolling and Window Functions:
- df.rolling(): Create a rolling view of a DataFrame for window-based operations.
- df.rolling(window=5).mean(): Calculate the rolling mean over a window of size 5.
- df.expanding(): Apply expanding window operations (i.e., cumulative).
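The following sketch compares a fixed rolling window with an expanding one on a short Series. A window of 3 is used instead of 5 so the effect is visible on five values.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# The first window-1 results are NaN because the window is not yet full.
rolling_mean = s.rolling(window=3).mean()

# min_periods=1 emits a value as soon as one observation is available.
partial = s.rolling(window=3, min_periods=1).mean()

# expanding grows the window from the start, so sum() matches cumsum().
expanding_sum = s.expanding().sum()
```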
Other Useful Methods:
- df.to_csv(): Save DataFrame to a CSV file.
- df.to_excel(): Save DataFrame to an Excel file.
- df.to_json(): Save DataFrame to a JSON file.
- df.memory_usage(): Get memory usage of DataFrame columns.
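As a sketch of the export methods, the example below serializes to text rather than to disk: when no path is given, to_csv() and to_json() return a string. The frame is invented, and orient="records" is one of several JSON layouts.

```python
import io
import json
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["Ana", "Ben"]})

# With no path argument, to_csv and to_json return the serialized text.
csv_text = df.to_csv(index=False)
json_text = df.to_json(orient="records")
records = json.loads(json_text)

# Reading the CSV text back recovers the same values.
round_trip = pd.read_csv(io.StringIO(csv_text))

# deep=True also counts the memory held by Python string objects.
bytes_per_column = df.memory_usage(deep=True)
```

Passing a file path as the first argument (for example df.to_csv("out.csv", index=False)) writes to disk instead.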
Lambda Functions:
- df.apply(lambda x: x * 2): Apply a lambda function to each column of the DataFrame (each column is passed as a Series; here every value is doubled).
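To illustrate what the lambda actually receives, this sketch applies one function per column and another per row. The frame is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})

# Default axis=0: the lambda receives each column as a Series.
doubled = df.apply(lambda col: col * 2)

# axis=1: the lambda receives one row at a time.
row_totals = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Plain arithmetic gives the same result as `doubled` without apply.
vectorized = df * 2
```

For simple arithmetic like this, the vectorized form is usually preferable; apply() is more useful when the function has no built-in equivalent.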
This list should give you a wide variety of methods to handle different tasks in pandas.