Add columns of different length in pandas

I have a problem with adding columns in pandas.
I have a DataFrame with dimensions n×k. While processing it, I will need to add columns with dimensions m×1, where m is in [1, n], but I don’t know m in advance.

When I try to do it:

df['Name column'] = data    
# type(data) = list

result:

AssertionError: Length of values does not match length of index   
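
A minimal reproduction of the error, using made-up data (the column contents are assumptions, not from the question). Recent pandas versions appear to report this as a ValueError rather than an AssertionError, so the sketch catches both:

```python
import pandas as pd

# Hypothetical data: a 3-row DataFrame and a 2-item list.
df = pd.DataFrame({'Age': [10, 12, 13]})
data = ['Nate A', 'Jessie A']

try:
    # A plain list must have exactly as many items as the DataFrame has rows.
    df['Name column'] = data
    error = None
except (ValueError, AssertionError) as exc:
    error = exc

print(type(error).__name__, error)
```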

Can I add columns with different lengths?

Here are the solutions:


Solution 1

If you use the accepted answer (Solution 2 below, which passes ignore_index=True), you’ll lose your column names, as shown in that answer’s example and described in the documentation (emphasis added):

The resulting axis will be labeled 0, …, n – 1. This is useful if you are concatenating objects where the concatenation axis does not have meaningful indexing information.

It looks like column names ('Name column') are meaningful to the original poster.

To keep the column names, use pandas.concat but don’t set ignore_index=True (its default value is False, so you can omit that argument altogether). Continue to use axis=1:

import pandas

# Note these columns have 3 rows of values:
original = pandas.DataFrame({
    'Age':[10, 12, 13], 
    'Gender':['M','F','F']
})

# Note this column has 4 rows of values:
additional = pandas.DataFrame({
    'Name': ['Nate A', 'Jessie A', 'Daniel H', 'John D']
})

new = pandas.concat([original, additional], axis=1) 
# Identical:
# new = pandas.concat([original, additional], ignore_index=False, axis=1) 

print(new.head())

#    Age Gender      Name
# 0  10.0      M    Nate A
# 1  12.0      F  Jessie A
# 2  13.0      F  Daniel H
# 3   NaN    NaN    John D

Notice how John D does not have an Age or a Gender.
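
In the question, the new data is a plain list rather than a DataFrame. A minimal sketch of the same idea for that case, assuming made-up values: wrapping the list in a named pandas.Series lets concat use the name as the column label. When the list is no longer than the DataFrame (as in the question, where m is at most n), assigning a Series directly also works, because pandas aligns it on the index.

```python
import pandas as pd

df = pd.DataFrame({'Age': [10, 12, 13], 'Gender': ['M', 'F', 'F']})
data = ['Nate A', 'Jessie A', 'Daniel H', 'John D']  # one more value than df has rows

# A named Series becomes a column with that name when concatenated along axis=1.
new_column = pd.Series(data, name='Name column')
result = pd.concat([df, new_column], axis=1)

print(result.columns.tolist())
print(len(result))

# Shorter data: assigning a Series (not a list) aligns on the index,
# so the rows without a value are filled with NaN.
shorter = df.copy()
shorter['Nickname'] = pd.Series(['Nat'])
print(shorter)
```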

Solution 2

Use concat and pass axis=1 and ignore_index=True:

In [38]:

import numpy as np
import pandas as pd
df = pd.DataFrame({'a':np.arange(5)})
df1 = pd.DataFrame({'b':np.arange(4)})
print(df1)
df
   b
0  0
1  1
2  2
3  3
Out[38]:
   a
0  0
1  1
2  2
3  3
4  4
In [39]:

pd.concat([df,df1], ignore_index=True, axis=1)
Out[39]:
   0    1
0  0  0.0
1  1  1.0
2  2  2.0
3  3  3.0
4  4  NaN
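
If the integer labels produced by ignore_index=True are not wanted, one option is to assign the names back afterwards. This is only a sketch of that idea; the variable name combined is an assumption:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(5)})
df1 = pd.DataFrame({'b': np.arange(4)})

combined = pd.concat([df, df1], ignore_index=True, axis=1)
# The columns are now labeled 0 and 1; put the original names back.
combined.columns = list(df.columns) + list(df1.columns)

print(combined)
```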

Solution 3

We can add lists of different sizes to a DataFrame by padding them to a common length first.

Example

a = [0,1,2,3]
b = [0,1,2,3,4,5,6,7,8,9]
c = [0,1]

Find the length of each list:

la,lb,lc = len(a),len(b),len(c)
# now find the max
max_len = max(la,lb,lc)

Resize every list to the determined max length by padding it with empty strings:

if not max_len == la:
  a.extend(['']*(max_len-la))
if not max_len == lb:
  b.extend(['']*(max_len-lb))
if not max_len == lc:
  c.extend(['']*(max_len-lc))

Now all the lists have the same length, so we can create the DataFrame:

pd.DataFrame({'A':a,'B':b,'C':c}) 

Final output:

   A  B  C
0  0  0  0
1  1  1  1
2  2  2   
3  3  3   
4     4   
5     5   
6     6   
7     7   
8     8   
9     9   
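
As an alternative sketch to the manual padding above, each list can be wrapped in a pandas.Series. The DataFrame constructor then aligns the Series on their index and fills the gaps with NaN instead of empty strings, without modifying the original lists. The variable name padded is an assumption:

```python
import pandas as pd

a = [0, 1, 2, 3]
b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
c = [0, 1]

# Each list becomes a Series indexed 0..len-1; the constructor takes the
# union of those indexes and fills missing positions with NaN.
padded = pd.DataFrame({'A': pd.Series(a), 'B': pd.Series(b), 'C': pd.Series(c)})

print(padded)
```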

Solution 4

I had the same issue: two different DataFrames without a common column. I just needed to put them beside each other in a CSV file.

  • Merge:
    In this case, “merge” does not work, even after adding a temporary column to both dfs and then dropping it, because merge forces both dfs to the same length: it repeats the rows of the shorter DataFrame to match the longer DataFrame’s length.
  • Concat:
    The idea of The Red Pea didn’t work for me. It just appended the shorter df to the longer one (row-wise) while leaving an empty column (NaNs) above the shorter df’s column. This happens when the two dfs do not share index labels.
  • Solution: You need to do the following:
# drop=True discards the old index instead of adding it as an 'index' column
df1 = df1.reset_index(drop=True)
df2 = df2.reset_index(drop=True)
df = [df1, df2]
df_final = pd.concat(df, axis=1)

df_final.to_csv(filename, index=False)

This way, you’ll see your dfs beside each other (column-wise), each with its own length.
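
A small sketch of why resetting the index matters, using hypothetical frames whose index labels do not overlap. concat with axis=1 aligns rows by index label, so without the reset the rows are stacked instead of placed side by side:

```python
import pandas as pd

# Hypothetical frames with non-overlapping index labels.
df1 = pd.DataFrame({'x': [1, 2, 3]}, index=[10, 11, 12])
df2 = pd.DataFrame({'y': ['a', 'b']}, index=[20, 21])

# Aligned by label: no labels match, so every row has a NaN in one column.
misaligned = pd.concat([df1, df2], axis=1)

# After resetting both indexes to 0..n-1, rows line up by position.
aligned = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1
)

print(len(misaligned), len(aligned))
```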

Solution 5

This is for anyone who would like to replace a specific column with one of a different size instead of adding a new column.

Based on the answer to “Create Pandas Dataframe with different sized columns”, I use a dict as an intermediate type.

If the column to be inserted is not a list but already a dict, the line that converts the list can be omitted.

import pandas as pd

def fill_column(dataframe: pd.DataFrame, values: list, column: str) -> pd.DataFrame:
    dict_from_list = dict(enumerate(values))  # Map each position to its value: {0: v0, 1: v1, ...}

    dataframe_as_dict = dataframe.to_dict()  # Get the DataFrame as a dict: {column: {index: value}}
    dataframe_as_dict[column] = dict_from_list  # Assign the specific column

    # Create a new DataFrame from the dict and return it
    return pd.DataFrame.from_dict(dataframe_as_dict, orient='index').T
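
A usage sketch of the dict-based approach with made-up data. The function is repeated in compact form so that the block runs on its own; note that rebuilding the DataFrame this way may change column dtypes as a side effect.

```python
import pandas as pd

def fill_column(dataframe, values, column):
    # Compact version of the dict-based replacement described above.
    as_dict = dataframe.to_dict()
    as_dict[column] = dict(enumerate(values))
    return pd.DataFrame.from_dict(as_dict, orient='index').T

# Hypothetical example: replace the 3-row column 'B' with 5 values.
df = pd.DataFrame({'A': [1, 2, 3], 'B': ['x', 'y', 'z']})
result = fill_column(df, [10, 20, 30, 40, 50], 'B')

print(result)
```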


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
