Implement the missing Series.swaplevel #1919
FYI: You have to remove the API from the "missing" files and add a link in the docs.
You can find the missing files in
Codecov Report
@@            Coverage Diff             @@
##           master    #1919      +/-   ##
==========================================
+ Coverage   94.08%   94.16%   +0.08%
==========================================
  Files          41       41
  Lines       10019    10022       +3
==========================================
+ Hits         9426     9437      +11
+ Misses        593      585       -8
databricks/koalas/series.py
Outdated
index_map = list(zip(self._internal.index_spark_column_names, self._internal.index_names))
index_map[i], index_map[j] = index_map[j], index_map[i]
index_spark_column_names, index_names = zip(*index_map)
internal = self._kdf._internal.copy(
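For readers following the diff, the zip-and-swap step above can be tried in isolation. This is only a minimal sketch in plain Python: the function name swap_index_levels is hypothetical, index names are simplified to plain strings, and the defaults i=-2, j=-1 are assumed to mirror pandas' swaplevel. It is not the Koalas implementation.

```python
from typing import List, Tuple


def swap_index_levels(
    spark_column_names: List[str],
    index_names: List[str],
    i: int = -2,
    j: int = -1,
) -> Tuple[List[str], List[str]]:
    """Swap levels i and j, keeping each column name paired with its index name.

    Hypothetical helper for illustration only.
    """
    # Pair the two lists so that a single swap moves both pieces of metadata.
    index_map = list(zip(spark_column_names, index_names))
    index_map[i], index_map[j] = index_map[j], index_map[i]
    # Unzip back into two parallel lists.
    swapped_columns, swapped_names = zip(*index_map)
    return list(swapped_columns), list(swapped_names)


columns, names = swap_index_levels(
    ["__index_level_0__", "__index_level_1__"], ["word", "number"]
)
print(columns)  # ['__index_level_1__', '__index_level_0__']
print(names)    # ['number', 'word']
```

Zipping first means the Spark column name and the index name can never drift apart, which is presumably why the diff takes this route instead of swapping two lists separately.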
self._internal instead of self._kdf._internal?
May I ask how the two differ in usage?
I see that self._internal is defined as return self._kdf._internal.select_column(self._column_label).
self._kdf._internal also contains the other columns of the anchor DataFrame, so the result can be built from the wrong column.
For example:
>>> midx = pd.MultiIndex.from_arrays([['a', 'b'], [1, 2]], names=['word', 'number'])
>>> kdf = ks.DataFrame({'a': [1, 2], 'b': ['x', 'y']}, index=midx)
>>> kdf
             a  b
word number
a    1       1  x
b    2       2  y
>>> kdf.b.swaplevel()
number  word
1       a       1
2       b       2
Name: a, dtype: int64
It should be:
>>> kdf.b.swaplevel()
number  word
1       a       x
2       b       y
Name: b, dtype: object
Thanks for explaining!
I'm a little confused by "contains". Their spark_frame seems to be identical:
>>> kdf.b._internal.spark_frame.show()
+-----------------+-----------------+---+---+-----------------+
|__index_level_0__|__index_level_1__|  a|  b|__natural_order__|
+-----------------+-----------------+---+---+-----------------+
|                a|                1|  1|  x|      42949672960|
|                b|                2|  2|  y|      94489280512|
+-----------------+-----------------+---+---+-----------------+
>>> kdf.b._kdf._internal.spark_frame.show()
+-----------------+-----------------+---+---+-----------------+
|__index_level_0__|__index_level_1__|  a|  b|__natural_order__|
+-----------------+-----------------+---+---+-----------------+
|                a|                1|  1|  x|      42949672960|
|                b|                2|  2|  y|      94489280512|
+-----------------+-----------------+---+---+-----------------+
I mean their metadata:
>>> kdf.b._internal.column_labels
[('b',)]
>>> kdf.b._kdf._internal.column_labels
[('a',), ('b',)]
or
>>> kdf.b._internal.data_spark_columns
[Column<b'b'>]
>>> kdf.b._kdf._internal.data_spark_columns
[Column<b'a'>, Column<b'b'>]
The column_labels or data_spark_columns in the InternalFrame for a Series contain only the single entry for that Series.
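To make the distinction concrete, here is a toy model in plain Python. ToyInternal and first_column_values are hypothetical names invented for this sketch; the real InternalFrame is far richer. It assumes only what the thread states: select_column keeps the same underlying data and narrows the metadata to one column.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Label = Tuple[str, ...]


@dataclass(frozen=True)
class ToyInternal:
    """Hypothetical stand-in for InternalFrame: shared data plus column metadata."""

    data: Dict[str, list]       # shared underlying columns (plays the role of spark_frame)
    column_labels: List[Label]  # the columns this view exposes

    def select_column(self, label: Label) -> "ToyInternal":
        # Same underlying data, but the metadata is narrowed to a single column.
        if label not in self.column_labels:
            raise KeyError(label)
        return ToyInternal(self.data, [label])


def first_column_values(internal: ToyInternal) -> list:
    # Code that trusts the metadata reads whichever column is listed first.
    return internal.data[internal.column_labels[0][0]]


frame_internal = ToyInternal({"a": [1, 2], "b": ["x", "y"]}, [("a",), ("b",)])
series_internal = frame_internal.select_column(("b",))

print(series_internal.data is frame_internal.data)  # True
print(frame_internal.column_labels)                 # [('a',), ('b',)]
print(series_internal.column_labels)                # [('b',)]
print(first_column_values(frame_internal))          # [1, 2]
print(first_column_values(series_internal))         # ['x', 'y']
```

In this sketch the data is shared, which matches the identical spark_frame output above, while only the metadata decides which column a Series operation ends up using.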
I see, that's clear to me now. Thank you!
Otherwise, LGTM.