Pandas DataFrame set value for multiple rows

Being able to set or update the values in multiple rows within a DataFrame is useful when undertaking feature engineering or data cleaning. In this post I will show the various ways you can do this with some simple examples.

Stephen Allwright

Pandas DataFrame set value for multiple rows

Setting a value for multiple rows in a DataFrame can be done in several ways, but the most common method is to set the new value based on a condition:

df.loc[df['column1'] >= 100, 'column2'] = 10

Set value for multiple rows based on a condition in Pandas

In this example we are changing values in the Score column based on a condition in the Age column.

import pandas as pd

df = pd.DataFrame(
    [
        ["Stephen", 30, 8],
        ["Olga", 65, 5],
        ["David", 25, 9],
        ["Jane", 42, 2],
        ["Manny", 51, 3],
        ["Sigrid", 18, 6],
    ],
    columns=["Name", "Age", "Score"],
)

print(df)

"""
Output:
      Name  Age  Score
0  Stephen   30      8
1     Olga   65      5
2    David   25      9
3     Jane   42      2
4    Manny   51      3
5   Sigrid   18      6
"""

df.loc[df['Age'] >= 50, 'Score'] = 10

print(df)

"""
Output:
      Name  Age  Score
0  Stephen   30      8
1     Olga   65     10
2    David   25      9
3     Jane   42      2
4    Manny   51     10
5   Sigrid   18      6
"""
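The same pattern extends to more than one condition. The sketch below is my own illustration on the example data, not part of the original example: it combines two boolean masks with & and uses .isin to match several names. The thresholds and names are chosen purely for demonstration.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["Stephen", 30, 8],
        ["Olga", 65, 5],
        ["David", 25, 9],
        ["Jane", 42, 2],
        ["Manny", 51, 3],
        ["Sigrid", 18, 6],
    ],
    columns=["Name", "Age", "Score"],
)

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
mask = (df["Age"] >= 25) & (df["Score"] < 5)
df.loc[mask, "Score"] = 0

# .isin matches rows whose value is any of the listed values (names chosen for illustration)
df.loc[df["Name"].isin(["Olga", "Sigrid"]), "Age"] = 99

print(df)
```

Building the mask as a separate variable keeps the .loc line readable and lets you inspect which rows will change before you assign anything.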

Set value for multiple rows based on index values in Pandas

If you don’t want to change a value based on a condition, but instead want to change a set of rows based on their index values, there are several ways to do this.

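Using .loc Pandas method

The .loc method allows you to set a value for a given slice of row labels and a list of column names. The .at method is intended for getting or setting a single value, so .loc is the one to use for a slice of rows.

df.loc[:3, ["Age", "Score"]] = 100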

"""
Output:
      Name  Age  Score
0  Stephen  100    100
1     Olga  100    100
2    David  100    100
3     Jane  100    100
4    Manny   51      3
5   Sigrid   18      6
"""
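When the rows you want to change are not next to each other, label-based selection with .loc also accepts a list of index labels instead of a slice. This is a minimal sketch on the example data; the labels 0, 2 and 5 are picked only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["Stephen", 30, 8],
        ["Olga", 65, 5],
        ["David", 25, 9],
        ["Jane", 42, 2],
        ["Manny", 51, 3],
        ["Sigrid", 18, 6],
    ],
    columns=["Name", "Age", "Score"],
)

# Pass a list of index labels to update non-contiguous rows in one assignment
df.loc[[0, 2, 5], "Score"] = 100

print(df)
```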

Using .iloc Pandas method

If you want to set the value for a slice of rows but don’t want to write out the column names, you can use the .iloc method, which selects both rows and columns by their integer position.

df.iloc[:3, [1, 2]] = 100

"""
Output:
      Name  Age  Score
0  Stephen  100    100
1     Olga  100    100
2    David  100    100
3     Jane   42      2
4    Manny   51      3
5   Sigrid   18      6
"""

One difference to note between these two methods is that .iloc uses exclusive indexing whilst .loc uses inclusive indexing, which is why they update different rows with the same index slice values: .loc[:3] includes the row labelled 3, whereas .iloc[:3] stops before position 3.
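A quick way to check the slicing behaviour for yourself is to compare how many rows each indexer selects for the same slice. This is a small sketch on the example data: .loc slices by label and includes the end label, while .iloc slices by position and excludes the end position.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["Stephen", 30, 8],
        ["Olga", 65, 5],
        ["David", 25, 9],
        ["Jane", 42, 2],
        ["Manny", 51, 3],
        ["Sigrid", 18, 6],
    ],
    columns=["Name", "Age", "Score"],
)

# Label-based slice: includes the row labelled 3, so four rows (0, 1, 2, 3)
rows_loc = df.loc[:3]

# Position-based slice: stops before position 3, so three rows (0, 1, 2)
rows_iloc = df.iloc[:3]

print(len(rows_loc), len(rows_iloc))
```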

Set value for multiple rows by replacing all occurrences in Pandas

If you want to replace all occurrences of a value regardless of where it is in the DataFrame then the .replace method is a good fit. Note that it matches the value in every column, not just the Score column.

df.replace(9, 100, inplace=True)

"""
Output:
      Name  Age  Score
0  Stephen   30      8
1     Olga   65      5
2    David   25    100
3     Jane   42      2
4    Manny   51      3
5   Sigrid   18      6
"""
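Because .replace looks at every column, you may sometimes want to limit it to one column. One way to do that, sketched here on the example data, is to pass a nested dictionary of the form {column: {old: new}}. The replacement values are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["Stephen", 30, 8],
        ["Olga", 65, 5],
        ["David", 25, 9],
        ["Jane", 42, 2],
        ["Manny", 51, 3],
        ["Sigrid", 18, 6],
    ],
    columns=["Name", "Age", "Score"],
)

# Only values in the Score column are considered; Age and Name are left alone
result = df.replace({"Score": {9: 100, 2: 0}})

print(result)
```

Without inplace=True the method returns a new DataFrame, so the original df is left unchanged here.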

If you would like to learn more about selection methods in Pandas then here are some articles that should interest you:

Pandas loc vs iloc


I'm a Data Scientist currently working for Oda, an online grocery retailer, in Oslo, Norway. These posts are my way of sharing some of the tips and tricks I've picked up along the way.