Welcome!

By registering with us, you'll be able to discuss, share and private message with other members of our community.

SignUp Now!

Column wise count in Pandas DF

Member
Joined
Feb 3, 2023
Messages
131
I have the below DF
ABCDCOLUMNS
1ABCABC2,3
DEFABCDEF2,4
LSDLSDLSD2,4
ERRERR2,3
ERRABCLSD2,4
Expected Output
A-1B-2C-3D-4F-5COLUMNS INDEXABCDEFERRLSD
1ABCABC2,32000
DEFABCDEFDEF2,41200
LSDLSDLSDLSD2,30002
ERRERR2,30020
ERRABCLSDLSD2,41011
I HAVE CREATED THE COULUMN INDEX FOR SUMMING BUT I COULDN'T ABLE TO LOOP EACH COLUMNS INDED AND GET THE INTENTED OUTPUT.
 
New member
Joined
Feb 6, 2023
Messages
6
Use DataFrame.melt with crosstab and DataFrame.join:

Code:
a = df[['B','C','D']].melt(ignore_index=False)['value']

df1 = df.join(pd.crosstab(a.index, a))

df1.insert(4, 'F', df1['D'])
d = {v: f'{v}-{k}' for k, v in enumerate(df1.columns[:5], 1)}
df1 = df1.rename(columns=d)
print (df1)
   A-1  B-2  C-3  D-4  F-5 COLUMNS  ABC  DEF  ERR  LSD
0  1.0  ABC  ABC  NaN  NaN     2,3    2    0    0    0
1  NaN  DEF  ABC  DEF  DEF     2,4    1    2    0    0
2  NaN  LSD  LSD  LSD  LSD     2,4    0    0    0    3
3  NaN  ERR  ERR  NaN  NaN     2,3    0    0    2    0
4  NaN  ERR  ABC  LSD  LSD     2,4    1    0    1    1
If need slice by COLUMN:

Code:
from collections import Counter

L = [Counter(b[a[0]-1:a[1]]) for a, b in zip(df['COLUMNS'].str.split(',', expand=True)
                                                        .astype(int).to_numpy(),
                                             df.to_numpy())]

df = df.join(pd.DataFrame(L, index=df.index).fillna(0).astype(int))
print (df)
     A    B    C    D COLUMNS  ABC  DEF  LSD  ERR
0  1.0  ABC  ABC  NaN     2,3    2    0    0    0
1  NaN  DEF  ABC  DEF     2,4    1    2    0    0
2  NaN  LSD  LSD  LSD     2,4    0    0    3    0
3  NaN  ERR  ERR  NaN     2,3    0    0    0    2
4  NaN  ERR  ABC  LSD     2,4    1    0    1    1
 
Top