Search code snippets, questions, articles...

Merge two or multiple DataFrames in Pandas

In this post, we are showing the techniques and methods that can be used to concatenate two or multiple DataFrames into one.
import pandas as pd

# DataFrame 1
df1 = pd.DataFrame({'id': ['1000', '1001', '1002', '1003'],
                    'company': ['Devsheet', 'SpaceX', 'Tesla', 'Open AI']})
  
# DataFrame 2
df2 = pd.DataFrame({'location': ['India', 'US', 'US', 'US'],
                    'product': ['Website', 'Rocket', 'Car', 'ML']})
  
result = pd.concat([df1, df2], axis=1)

print(result)
Best JSON Validator, JSON Tree Viewer, JSON Beautifier at same place. Check how cool is the tool

Output

     id   company location  product
0  1000  Devsheet    India  Website
1  1001    SpaceX       US   Rocket
2  1002     Tesla       US      Car
3  1003   Open AI       US       ML

We are joining two DataFrames df1 and df2 here using the pandas.concat() function. We are making a list from the DataFrames df1 and df2 and passing the list to concat() function.

We are merging DataFrames by columns means adding new columns that are new in the second DataFrame.

Merge DataFrames by Columns

To merge DataFrame by columns we pass axis=1 in the concat() function. This will add new columns from the first and second DataFrame to the new DataFrame.

import pandas as pd

# DataFrame 1
df1 = pd.DataFrame({'company': ['Devsheet', 'SpaceX', 'Tesla', 'Open AI']})
  
# DataFrame 2
df2 = pd.DataFrame({'location': ['India', 'US', 'US', 'US']})
  
result = pd.concat([df1, df2], axis=1)

print(result)

Output

    company location
0  Devsheet    India
1    SpaceX       US
2     Tesla       US
3   Open AI       US

Merge DataFrames by rows

You can merge DataFrames by rows by joining rows of Columns with the same names. Below is an example that can be helpful to understand.

import pandas as pd

# DataFrame 1
df1 = pd.DataFrame({
  'id': [100, 101, 102, 103],
  'company': ['Devsheet', 'SpaceX', 'Tesla', 'Open AI'],
})
  
# DataFrame 2
df2 = pd.DataFrame({
  'id': [104, 105, 106],
  'company': ['Microsoft', 'Apple', 'Amazon']
})
  
result = pd.concat([df1, df2])

print(result)

Output

    id    company
0  100   Devsheet
1  101     SpaceX
2  102      Tesla
3  103    Open AI
0  104  Microsoft
1  105      Apple
2  106     Amazon
Was this helpful?
0 Comments