
> Is there a function that is like groupby().transform() but takes multiple columns as input? I could use apply and then fill in df by using pd.match, but that would necessitate matching over sometimes multiple groupby columns (col1 and col2), which seems really hacky and would take a fair amount of code.

What is the next best alternative in terms of speed / elegance?

The abstract definition of grouping is to provide a mapping of labels to group names, and pandas objects can be split on any of their axes. Use pandas groupby() to group the rows by a column, then call count() to get the count for each group, ignoring None and NaN values. For example, grouping on a Courses column and calling count() gives how many times each value is present. It works with non-floating-point data as well.

But transform apparently isn't able to combine multiple columns together because it looks at each column separately (unlike apply), and it fails with:

TypeError: cannot concatenate a non-NDFrame object

# works with apply, but I want transform:
df.groupby(...)[...].apply(foo_function)
df.groupby(...)[...].transform(foo_function)
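As a minimal sketch of the counting behaviour described above: the data here is made up for illustration, and only the Courses column name comes from the text. The Fee column is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the "Courses" column name appears in the text.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Spark", "Pandas", "Spark", None],
    "Fee": [22000, 25000, np.nan, 24000, 26000, 30000],
})

# count() reports non-null entries per group, so the NaN fee is skipped.
# The row whose Courses value is None forms no group (dropna=True by default).
fee_counts = df.groupby("Courses")["Fee"].count()

# size() counts rows per group regardless of nulls in the other columns.
row_counts = df.groupby("Courses").size()

print(fee_counts)
print(row_counts)
```

The difference between count() and size() only shows up when a group contains missing values, as the Spark group does here.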

Here's an example function:

foo_function = lambda x: np.sum(x.a + x.b)

The apply() method receives each group as a DataFrame and returns a scalar, a Series, or a DataFrame, so it allows operations that combine several columns of the group. Although that Groupby implementation is much faster than pandas GroupBy.apply and GroupBy.transform with user-defined functions, pandas is much faster with common functions like mean and sum because they are implemented in Cython.
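One way to get a transform-like result from a multi-column function is sketched below, under stated assumptions: the frame, its values, and the new_col name are invented for illustration, while col1/col2 and a/b follow the names used in the question and in foo_function. The first approach runs apply per group and joins the result back on the key columns. The second rewrites this particular function so that the built-in "sum" can be used, which works only because foo_function is a sum of row-wise terms.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: col1/col2 are the group keys, a/b the value columns.
df = pd.DataFrame({
    "col1": ["x", "x", "y", "y", "y"],
    "col2": [1, 1, 1, 2, 2],
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [10.0, 20.0, 30.0, 40.0, 50.0],
})

foo_function = lambda x: np.sum(x.a + x.b)
keys = ["col1", "col2"]

# Approach 1: apply sees each group as a DataFrame, so it can combine a and b.
# The result is one scalar per (col1, col2) pair.
per_group = df.groupby(keys)[["a", "b"]].apply(foo_function).rename("new_col")

# Joining on the key columns broadcasts each group's scalar to its rows.
result = df.join(per_group, on=keys)

# Approach 2: combine the columns row-wise first, then use the built-in
# "sum" aggregation through transform, which avoids a Python-level function.
fast = (df["a"] + df["b"]).groupby([df["col1"], df["col2"]]).transform("sum")

print(result)
print(fast)
```

The join handles any number of key columns without manual matching. The second form is not general: it applies only when the function can be split into a row-wise step followed by a built-in aggregation.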
The difference between apply() and transform() is the argument passed and the value returned.

Update 9/30/17: Code for a faster version of Groupby is available here as part of the hdfe package.

I have a big dataframe, and I'm grouping by one to n columns, and want to apply a function on these groups across two columns. The apply() and transform() methods are both used in conjunction with the groupby() method call.
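To illustrate how the two methods differ in what they return, here is a small sketch. The frame and the column names key, v1, and v2 are hypothetical. With apply, the function can use several columns of a group at once and the result here has one entry per group. With transform, the function has to work column by column and the result keeps the shape and index of the input.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "key": ["a", "a", "b"],
    "v1": [1, 2, 3],
    "v2": [10, 20, 30],
})
grouped = df.groupby("key")[["v1", "v2"]]

# apply: the function gets the group's sub-frame and may mix columns.
# Returning a scalar gives a Series indexed by the group keys.
reduced = grouped.apply(lambda g: (g["v1"] * g["v2"]).sum())

# transform: the function must be valid for a single column, and the
# output is aligned row for row with the original frame.
centered = grouped.transform(lambda col: col - col.mean())

print(reduced)
print(centered)
```

Because the transform result shares the original index, it can be assigned straight back as new columns, whereas the apply result above has to be joined or mapped back onto the rows.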
