Every experienced data scientist knows that they’ll spend more time cleaning data than building models. In fact, a study by Anaconda found that data scientists spend 45% of their time on data preparation tasks. That’s where Python one-liners come in. Let’s dive into three of my favourites for data cleaning.
Python One Liners for Data Cleaning
Below are my favourite Python one-liners for data cleaning. Think of these as the high-leverage tools in your data cleaning toolbox.
The NaN Whisperer for Efficiently Filling Missing Values
Most beginners iterate through a DataFrame or chain multiple .fillna() calls, one per column. It’s slow, clunky, and prone to errors.
We can use a dictionary instead to fill missing values for multiple columns in a single, elegant line.
Here’s an example to fill NaNs based on a dictionary:
import pandas as pd
import numpy as np
# DataFrame with missing values
data = {'Age': [25, 29, np.nan, 45, 30],
'City': ['New York', 'London', 'Paris', np.nan, 'London'],
'Salary': [60000, 80000, 75000, 95000, np.nan]}
df = pd.DataFrame(data)
print(df)
    Age      City   Salary
0  25.0  New York  60000.0
1  29.0    London  80000.0
2   NaN     Paris  75000.0
3  45.0       NaN  95000.0
4  30.0    London      NaN
# The one-liner hack to fill NaNs based on a dictionary
df.fillna({'Age': df['Age'].mean(), 'City': df['City'].mode()[0], 'Salary': df['Salary'].median()}, inplace=True)
print(df)
     Age      City   Salary
0  25.00  New York  60000.0
1  29.00    London  80000.0
2  32.25     Paris  75000.0
3  45.00    London  95000.0
4  30.00    London  77500.0
Real-world data doesn’t just have one type of missing value. You might need to fill Age with the mean, but City with the mode. This one-liner handles it all in a single, readable command. It’s a declarative approach; you’re stating what you want to do, not how to do it.
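When a table has many columns, the fill dictionary itself can be built with a dict comprehension instead of being typed out. The sketch below assumes a simple rule (median for numeric columns, mode for everything else); that rule and the names fill_values and filled are illustrative choices, not part of the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Age": [25, 29, np.nan, 45, 30],
    "City": ["New York", "London", "Paris", np.nan, "London"],
    "Salary": [60000, 80000, 75000, 95000, np.nan],
})

# Assumed rule: median for numeric columns, most frequent value otherwise
fill_values = {
    col: df[col].median()
    if pd.api.types.is_numeric_dtype(df[col])
    else df[col].mode()[0]
    for col in df.columns
}

# Without inplace=True, fillna() returns a new DataFrame and leaves df as it was
filled = df.fillna(fill_values)
print(filled)
```

A column that is entirely NaN has no mode, so mode()[0] would fail there; such columns need a default value of their own.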
The Column Renamer for Making Your Data Readable
One of the biggest mistakes beginners make is renaming columns manually, one by one. That is fine for a handful of columns, but what if you have 50?
We can use a list comprehension instead to normalize column names. This is ideal for cleaning up inconsistent names, such as those found in a database dump or a messy CSV file. Here’s an example:
import re
# Messy column names, as they might arrive from a database dump or CSV file
df.columns = ['Age (yrs)', 'City_of_residence', 'Salary $']
# The wrong way: map every column by hand
df_manual = df.rename(columns={'Age (yrs)': 'age', 'City_of_residence': 'city_of_residence', 'Salary $': 'salary'})
print(df_manual)
# The one-liner hack to clean up all columns
df.columns = [re.sub(r'[^\w\s]', '', col.lower()).strip().replace(' ', '_') for col in df.columns]
print(df)
This is a powerful general-purpose hack for:
- Lowercasing all columns.
- Replacing spaces with underscores.
- Removing special characters ((), $, %, etc.).
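The three steps in this list can also be wrapped in a small helper so the same rule is reused across datasets. This is a minimal sketch: the function name clean_column_name and the sample names in raw are hypothetical, and the rule (drop punctuation, then join words with underscores) is one reasonable reading of the list above.

```python
import re

def clean_column_name(name: str) -> str:
    """Lowercase a column name, drop punctuation, join words with underscores."""
    # Remove everything that is not a letter, digit, underscore, or whitespace
    name = re.sub(r"[^\w\s]", "", name.lower())
    # Trim the ends, then turn each run of whitespace into one underscore
    return re.sub(r"\s+", "_", name.strip())

raw = ["Age (yrs)", "City_of_residence", "Salary $", "  Growth %  "]
cleaned = [clean_column_name(col) for col in raw]
print(cleaned)

# Two different raw names can clean to the same result, so check for collisions
has_duplicates = len(set(cleaned)) != len(cleaned)
print(has_duplicates)
```

With a DataFrame, the helper slots into the same list comprehension: df.columns = [clean_column_name(col) for col in df.columns].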
The apply Powerhouse for Custom Transformations
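Many beginners write a for loop to apply a custom function to a column. It’s often slow and not the pandas way.
We can use the .apply() method with a lambda function instead for quick, one-line transformations. Note that .apply() returns a new Series rather than modifying the column in place, so the result has to be assigned back. Here’s an example: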
# DataFrame with mixed data
df_mixed = pd.DataFrame({'OrderID': ['ABC1234', 'DEF5678', 'GHI9012'],
'Value': ['50.00 USD', '25.50 EUR', '100.00 USD']})
print(df_mixed)
   OrderID       Value
0  ABC1234   50.00 USD
1  DEF5678   25.50 EUR
2  GHI9012  100.00 USD
# The one-liner hack to extract numeric value and convert to float
df_mixed['Value_USD'] = df_mixed['Value'].apply(lambda x: float(x.split()[0]))
print(df_mixed)
   OrderID       Value  Value_USD
0  ABC1234   50.00 USD       50.0
1  DEF5678   25.50 EUR       25.5
2  GHI9012  100.00 USD      100.0
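This single line splits each string on whitespace, takes the first element, and converts it to a float. Despite the column name, no currency conversion happens: the EUR row simply keeps its numeric amount. The line is concise and readable, though .apply() still calls the lambda once per row, so it is not the fastest option on very large columns.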
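One alternative is to do the same parsing with pandas’ vectorized string methods, which avoid a Python-level lambda call per row and can keep the currency code as well. This is a sketch under assumptions: the regular expression expects values shaped like "50.00 USD", and the column names amount and currency are illustrative.

```python
import pandas as pd

df_mixed = pd.DataFrame({
    "OrderID": ["ABC1234", "DEF5678", "GHI9012"],
    "Value": ["50.00 USD", "25.50 EUR", "100.00 USD"],
})

# Named groups become the column names of the DataFrame returned by extract()
parts = df_mixed["Value"].str.extract(
    r"^(?P<amount>\d+(?:\.\d+)?)\s+(?P<currency>[A-Z]{3})$"
)

df_mixed["amount"] = parts["amount"].astype(float)
df_mixed["currency"] = parts["currency"]
print(df_mixed)
```

Rows that do not match the pattern come back as NaN instead of raising an error, which makes malformed values easy to find with isna().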
Final Words
These one-liners aren’t just syntax tricks. They replace repetitive, error-prone cleaning code with short, readable commands. Start incorporating them into your daily workflow, and the time saved adds up quickly. I hope you liked this article on three of my favourite Python one-liners for data cleaning. Feel free to ask questions in the comments section below.





