I'm having an issue merging a folder of CSV files

/home/runner/SomeQuintessentialOrder/main.py:1: DeprecationWarning: 
Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
        
  import pandas as pd
Traceback (most recent call last):
  File "/home/runner/SomeQuintessentialOrder/main.py", line 43, in <module>
    merged_file_path = merge_csvs_with_columns_and_update_columns(folder_path, output_file, columns)
  File "/home/runner/SomeQuintessentialOrder/main.py", line 21, in merge_csvs_with_columns_and_update_columns
    master_df[['Customer', 'Sender']] = master_df['campaignName'].apply(
  File "/home/runner/SomeQuintessentialOrder/.pythonlibs/lib/python3.10/site-packages/pandas/core/frame.py", line 4287, in __setitem__
    self._setitem_array(key, value)
  File "/home/runner/SomeQuintessentialOrder/.pythonlibs/lib/python3.10/site-packages/pandas/core/frame.py", line 4346, in _setitem_array
    self._iset_not_inplace(key, value)
  File "/home/runner/SomeQuintessentialOrder/.pythonlibs/lib/python3.10/site-packages/pandas/core/frame.py", line 4365, in _iset_not_inplace
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key

Hello @raymond28, welcome to Replit Ask!

To get rid of the DeprecationWarning, please run pip install pyarrow in the shell. Taken from here.

If you’re asking about the Pandas error (and not the warning), then first: without seeing the relevant parts of your code it’s hard to give a reliable diagnosis. My guess would be that the result of master_df['campaignName'].apply(…) does not have the same number of columns as the two labels you’re assigning to. Take the following example, which attempts to fit 3 columns into 2:

import pandas as pd

df = pd.DataFrame({"a": [0, 1], "b": [2, 3], "c": [4, 5], "d": [6, 7], "e": [8, 9]})
df[["a", "b"]] = df[["c", "d", "e"]]  # <- Error here

you’ll see a similar traceback (not exactly the same, though: the last frames differ, which suggests your right-hand side is not a DataFrame as it is in my example):

Traceback (most recent call last):
  File "/home/runner/PythonSandbox/main.py", line 11, in <module>
    df[["a", "b"]] = df[["c", "d", "e"]]
  File "/home/runner/PythonSandbox/.pythonlibs/lib/python3.10/site-packages/pandas/core/frame.py", line 4287, in __setitem__
    self._setitem_array(key, value)
  File "/home/runner/PythonSandbox/.pythonlibs/lib/python3.10/site-packages/pandas/core/frame.py", line 4329, in _setitem_array
    check_key_length(self.columns, key, value)
  File "/home/runner/PythonSandbox/.pythonlibs/lib/python3.10/site-packages/pandas/core/indexers/utils.py", line 390, in check_key_length
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key

But again: that’s just a guess. Without more details it’s hard to say.
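To make the guess a bit more concrete, here is a minimal sketch of one way to end up in the same `_iset_not_inplace` frame as in your traceback. Everything in it is an assumption for illustration: the sample campaignName values, the "_" separator, and the idea that your apply function returns a list per row. If apply returns a list, Pandas hands back a single Series of lists rather than two columns, and assigning that to two labels fails.

```python
import pandas as pd

# Hypothetical data: the real campaignName values and the "_" separator are assumptions.
df = pd.DataFrame({"campaignName": ["Acme_Alice", "Globex_Bob", "Initech_Carol"]})

# A function that returns a list gives ONE Series whose elements are lists,
# not a two-column DataFrame.
parts = df["campaignName"].apply(lambda name: name.split("_"))

raised = False
try:
    # Two labels on the left, a 1-D Series on the right: shapes do not line up.
    df[["Customer", "Sender"]] = parts
except ValueError as exc:
    raised = True
    print(exc)

# Expanding the lists into a real two-column DataFrame makes the assignment work,
# as long as every row splits into exactly two parts.
expanded = pd.DataFrame(parts.tolist(), index=df.index)
df[["Customer", "Sender"]] = expanded
print(df)
```

Whether this matches your case depends on what the function inside apply actually returns, so treat it as something to compare against your code rather than a diagnosis.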

It appears that you’re encountering an error in your Python script. The error message “ValueError: Columns must be same length as key” means that the number of columns in the value you’re assigning does not match the number of column labels in the key (here, the two labels ‘Customer’ and ‘Sender’).

Here’s a breakdown of the error message and potential causes:

  1. DeprecationWarning for PyArrow: This warning is informing you that PyArrow will become a required dependency of Pandas in the next major release. If you encounter issues with this change, you’re encouraged to provide feedback to the Pandas development team.
  2. The import pandas as pd line: This is not an import error. The line is printed as part of the warning, which points at line 1 of main.py where Pandas is imported. The import itself succeeded, since the script went on to run until line 43.
  3. ValueError: Columns must be same length as key: This is the main error you’re encountering. It occurs when you assign to several columns of a DataFrame at once, but the value on the right-hand side does not provide exactly one column per label on the left-hand side.

To resolve this issue, you’ll need to ensure that the right-hand side of the assignment yields exactly two columns, one for ‘Customer’ and one for ‘Sender’. Double-check the function you apply to the ‘campaignName’ column, as this seems to be where the problem lies. You may need to debug this section of your code to see what shape its result actually has.
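As a rough sketch of that debugging step, the snippet below counts how many parts each campaignName splits into and lists the rows that do not produce exactly two. The sample values and the "_" separator are hypothetical, since the original code and data were not posted; adapt both to the real naming scheme.

```python
import pandas as pd

# Hypothetical data: one well-formed name, one without a separator, one with an extra part.
df = pd.DataFrame({"campaignName": ["Acme_Alice", "Globex", "Initech_Carol_EU"]})

# Number of parts per row after splitting on the assumed "_" separator.
part_counts = df["campaignName"].str.split("_").str.len()

# Rows that would not map cleanly onto the two target columns.
unexpected = df.loc[part_counts != 2, "campaignName"]
print(unexpected.tolist())

# One possible way to always get two columns: split at most once and expand.
# reindex guards the case where no row contains the separator at all.
split = df["campaignName"].str.split("_", n=1, expand=True).reindex(columns=[0, 1])
df[["Customer", "Sender"]] = split
print(df)
```

Splitting at most once keeps any extra parts in the second column and leaves it empty when the separator is missing. Whether that is the right handling for such rows is a decision only the data owner can make.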

If you need further assistance, feel free to provide more details or the relevant code snippet, and I’d be happy to help you debug further.

This has almost certainly been generated by ChatGPT. Have you given it a human check?
