such annotations are only allowed in arguments to *-parameters


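This error came up when I ran a union on two DataFrames whose columns were in different orders, so I needed to reorder the columns of one DataFrame to match the other.
The DataFrame has more than three hundred columns, so writing every column name by hand in a select to fix the order is not realistic.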
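
As an aside, if the only goal is a union that ignores column order, Spark 2.3 and later provide `unionByName`, which matches columns by name rather than by position. The following is a minimal sketch, assuming both DataFrames have exactly the same set of column names; the object name, the sample data, and the local `SparkSession` are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object UnionByNameSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would reuse its own session.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("union-by-name-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical data: same column names, different order.
    val df  = Seq((1, "a")).toDF("id", "name")
    val df2 = Seq(("b", 2)).toDF("name", "id")

    // union pairs columns by position; unionByName pairs them by name
    // and keeps the column order of the left-hand DataFrame.
    val merged = df.unionByName(df2)
    merged.show()

    spark.stop()
  }
}
```

This avoids the reordering step entirely, but it does not apply when the reordered DataFrame is needed for something other than the union.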

    val dfcols=df.columns
    val df3=df2.select(dfcols:_*)

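This is how I first wrote the code. I believe it used to compile, but now it fails outright; the Spark version here is 2.4. Many of the answers online use this same form.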

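I went and read the source of `select` myself. The overload that takes strings requires a leading `col` argument, `select(col: String, cols: String*)`, so an `Array[String]` cannot be expanded into it with `: _*` alone. Just below `select` in the source I found `selectExpr`: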

    def selectExpr(exprs: String*): DataFrame
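
Given those signatures, the reordering can be written in a few ways that do compile. The following is a minimal sketch, not a definitive fix: the object name `ReorderColumns`, the three helper names, and the sample data are all made up for illustration, and it assumes every name in `order` exists in the DataFrame being reordered.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ReorderColumns {
  // Convert each name to a Column so the call matches select(cols: Column*).
  // Note: col() treats a dot in a name as nested-field access.
  def viaColumns(df: DataFrame, order: Array[String]): DataFrame =
    df.select(order.map(col): _*)

  // Split the array so the call matches select(col: String, cols: String*).
  // Throws on an empty array, because head is undefined there.
  def viaHeadTail(df: DataFrame, order: Array[String]): DataFrame =
    df.select(order.head, order.tail: _*)

  // selectExpr(exprs: String*) is a plain varargs method, so `: _*` is allowed.
  // Each string is parsed as a SQL expression, so unusual names need backticks.
  def viaSelectExpr(df: DataFrame, order: Array[String]): DataFrame =
    df.selectExpr(order: _*)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("reorder-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-ins for the two 300-column DataFrames.
    val df  = Seq((1, "a", 2.0)).toDF("id", "name", "score")
    val df2 = Seq(("b", 3.0, 2)).toDF("name", "score", "id")

    // Reorder df2 to follow df, then union by position.
    val df3 = viaColumns(df2, df.columns)
    df.union(df3).show()

    spark.stop()
  }
}
```

For plain column names the three variants give the same result; `viaColumns` is probably the safest default, since it does not depend on the array being non-empty or on SQL expression parsing.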

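I also just checked the documentation on the Spark website, and in the latest version, 3.1.1, `select` appears to accept a list of column names. Note that the excerpt below is the Python (PySpark) `select`, which takes `*cols`; it does not by itself show that the Scala call above compiles.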

    def select(self, *cols):
        """Projects a set of expressions and returns a new :class:`DataFrame`.

        .. versionadded:: 1.3.0

        Parameters
        ----------
        cols : str, :class:`Column`, or list
            column names (string) or expressions (:class:`Column`).
            If one of the column names is '*', that column is expanded to include all columns
            in the current :class:`DataFrame`.

        Examples
        --------
        >>> df.select('*').collect()
        [Row(age=2, name='Alice'), Row(age=5, name='Bob')]
        >>> df.select('name', 'age').collect()
        [Row(name='Alice', age=2), Row(name='Bob', age=5)]
        >>> df.select(df.name, (df.age + 10).alias('age')).collect()
        [Row(name='Alice', age=12), Row(name='Bob', age=15)]
        """
        jdf = self._jdf.select(self._jcols(*cols))
        return DataFrame(jdf, self.sql_ctx)
