Spark SQL Built-in Functions (2): Map Functions (based on Spark 3.2.0)

Tags: spark, big data technology stack, sql

Preface

This article belongs to the column 《1000个问题搞定大数据技术体系》 (1000 Questions to Master the Big Data Technology Stack). The column is the author's original work; please cite the source when quoting, and feel free to point out any shortcomings or mistakes in the comments. Thank you!

For the column's table of contents and references, see 1000个问题搞定大数据技术体系.

Contents

Spark SQL Built-in Functions (1): Array Functions (based on Spark 3.2.0)

Spark SQL Built-in Functions (2): Map Functions (based on Spark 3.2.0)

Spark SQL Built-in Functions (3): Date and Timestamp Functions (based on Spark 3.2.0)

Spark SQL Built-in Functions (4): JSON Functions (based on Spark 3.2.0)

Spark SQL Built-in Functions (5): Aggregate Functions (based on Spark 3.2.0)

Spark SQL Built-in Functions (6): Window Functions (based on Spark 3.2.0)

Main Text

element_at(array, index)

Description

  • Returns the element of array at the given index (array indices start at 1).
  • If the index is negative, it counts from the end: -1 refers to the last element of the array.
  • If the index exceeds the length of the array and spark.sql.ansi.enabled is set to false, the function returns NULL.
  • If spark.sql.ansi.enabled is set to true, an ArrayIndexOutOfBoundsException is thrown when an invalid index is used.

Examples

SELECT element_at(array(1, 2, 3), 2);
+-----------------------------+
|element_at(array(1, 2, 3), 2)|
+-----------------------------+
|                            2|
+-----------------------------+
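
The negative-index rule from the description can be checked the same way. A minimal sketch (the commented result is what the rule implies, not captured CLI output):

SELECT element_at(array(1, 2, 3), -1);
-- expected: 3, since -1 refers to the last element of the array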

With spark.sql.ansi.enabled at its default value of false, an out-of-range index returns NULL:

SELECT element_at(array(1, 2, 3), 4);
+-----------------------------+
|element_at(array(1, 2, 3), 4)|
+-----------------------------+
|                         NULL|
+-----------------------------+

With spark.sql.ansi.enabled set to true, the same query throws an exception instead:

spark-sql> SELECT element_at(array(1, 2, 3), 4);
21/11/21 20:14:34 ERROR SparkSQLDriver: Failed in [SELECT element_at(array(1, 2, 3), 4)]
java.lang.ArrayIndexOutOfBoundsException: Invalid index: 4, numElements: 3
        at org.apache.spark.sql.catalyst.expressions.ElementAt.$anonfun$doElementAt$1(collectionOperations.scala:2014)
        at org.apache.spark.sql.catalyst.expressions.ElementAt.nullSafeEval(collectionOperations.scala:2004)
        at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:579)
        at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:66)
        at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:54)
        at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$1(TreeNode.scala:316)
        at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:72)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:316)
        at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$3(TreeNode.scala:321)
        at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:406)
        at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:242)
        at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:404)
        at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:357)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:321)
        at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsDown$1(QueryPlan.scala:94)
        at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:116)
        at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:72)
        at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:116)
        at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:127)
        at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$3(QueryPlan.scala:132)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
        at scala.collection.immutable.List.foreach(List.scala:431)
        at scala.collection.TraversableLike.map(TraversableLike.scala:286)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
        at scala.collection.immutable.List.map(List.scala:305)
        at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:132)
        at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:137)
        at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:242)
        at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:137)
        at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsDown(QueryPlan.scala:94)
        at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1.applyOrElse(expressions.scala:54)
        at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1.applyOrElse(expressions.scala:53)
        at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$1(TreeNode.scala:316)
        at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:72)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:316)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
        at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown(AnalysisHelper.scala:171)
        at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown$(AnalysisHelper.scala:169)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:305)
        at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$.apply(expressions.scala:53)
        at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$.apply(expressions.scala:44)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:215)
        at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
        at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
        at scala.collection.immutable.List.foldLeft(List.scala:91)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:212)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:204)
        at scala.collection.immutable.List.foreach(List.scala:431)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:204)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:182)
        at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:182)
        at org.apache.spark.sql.execution.QueryExecution.$anonfun$optimizedPlan$1(QueryExecution.scala:88)
        at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
        at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:144)
        at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
        at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:144)
        at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:85)
        at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:85)
        at org.apache.spark.sql.execution.QueryExecution.assertOptimized(QueryExecution.scala:96)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:114)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:111)
        at org.apache.spark.sql.execution.QueryExecution.$anonfun$simpleString$2(QueryExecution.scala:162)
        at org.apache.spark.sql.execution.ExplainUtils$.processPlan(ExplainUtils.scala:115)
        at org.apache.spark.sql.execution.QueryExecution.simpleString(QueryExecution.scala:162)
        at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$explainString(QueryExecution.scala:207)
        at org.apache.spark.sql.execution.QueryExecution.explainString(QueryExecution.scala:176)
        at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:98)
        at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
        at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
        at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:69)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:381)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:501)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:495)
        at scala.collection.Iterator.foreach(Iterator.scala:943)
        at scala.collection.Iterator.foreach$(Iterator.scala:943)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
        at scala.collection.IterableLike.foreach(IterableLike.scala:74)
        at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:495)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:284)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:952)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1031)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1040)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


element_at(map, key)

Description

  • Returns the value for the given key in the map.
  • If the map does not contain the key:
    • with spark.sql.ansi.enabled set to false, NULL is returned;
    • with spark.sql.ansi.enabled set to true, a NoSuchElementException is thrown.

Examples


SELECT element_at(map(1, 'a', 2, 'b'), 2);
+------------------------------+
|element_at(map(1, a, 2, b), 2)|
+------------------------------+
|                             b|
+------------------------------+
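
The missing-key case from the description can be sketched the same way, with spark.sql.ansi.enabled left at its default of false (the commented result follows from the rule above, it is not captured output):

SELECT element_at(map(1, 'a', 2, 'b'), 3);
-- expected: NULL, because key 3 is not present and ANSI mode is off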


map(key0, value0, key1, value1, …)

Description

Creates a map with the given key/value pairs.

Examples

SELECT map(1.0, '2', 3.0, '4');
+--------------------+
| map(1.0, 2, 3.0, 4)|
+--------------------+
|   {1.0:"2",3.0:"4"}|
+--------------------+


map_concat(map, …)

Description

Returns the union of all the given maps.

Examples

SELECT map_concat(map(1, 'a', 2, 'b'), map(3, 'c'));
+--------------------------------------+
|map_concat(map(1, a, 2, b), map(3, c))|
+--------------------------------------+
|                   {1:"a",2:"b",3:"c"}|
+--------------------------------------+
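
map_concat also accepts more than two maps. A quick sketch, assuming all keys are distinct so no duplicate-key handling comes into play (the commented result is the expected one, not captured output):

SELECT map_concat(map(1, 'a'), map(2, 'b'), map(3, 'c'));
-- expected: {1:"a",2:"b",3:"c"}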


map_entries(map)

Description

Returns all the entries of the map as an unordered array.

Examples

SELECT map_entries(map(1, 'a', 2, 'b'));
+---------------------------------------------+
|                 map_entries(map(1, a, 2, b))|
+---------------------------------------------+
|[{"key":1,"value":"a"},{"key":2,"value":"b"}]|
+---------------------------------------------+
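
Because the result is an array of structs, it combines naturally with other array functions. A hedged sketch using explode to turn the entries into rows (the commented rows are the expected output, not captured):

SELECT explode(map_entries(map(1, 'a', 2, 'b')));
-- expected: two rows, one struct per entry: {1, a} and {2, b}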


map_from_arrays(keys, values)

Description

  • Creates a map from the given arrays of keys and values.
  • No element of keys may be null.

Examples

SELECT map_from_arrays(array(1.0, 3.0), array('2', '4'));
+---------------------------------------------+
|map_from_arrays(array(1.0, 3.0), array(2, 4))|
+---------------------------------------------+
|                            {1.0:"2",3.0:"4"}|
+---------------------------------------------+
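
Only the keys are restricted to non-null; a NULL among the values is worth trying. A sketch under the assumption that null map values are allowed (the commented result is the expected one, not captured output):

SELECT map_from_arrays(array(1.0, 3.0), array('2', NULL));
-- expected: {1.0:"2",3.0:null}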


map_from_entries(arrayOfEntries)

Description

Creates a map from the given array of entries (key/value structs).

Examples

SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
+---------------------------------------------------+
|map_from_entries(array(struct(1, a), struct(2, b)))|
+---------------------------------------------------+
|                                      {1:"a",2:"b"}|
+---------------------------------------------------+


map_keys(map)

Description

Returns an unordered array containing all the keys of the map.

Examples

SELECT map_keys(map(1, 'a', 2, 'b'));
+-------------------------+
|map_keys(map(1, a, 2, b))|
+-------------------------+
|                   [1, 2]|
+-------------------------+
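
A common use of map_keys is a key-existence check together with array_contains. A quick sketch (the commented result follows from the inputs shown):

SELECT array_contains(map_keys(map(1, 'a', 2, 'b')), 2);
-- expected: true, because 2 is one of the map's keys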


map_values(map)

Description

Returns an unordered array containing all the values of the map.

Examples

SELECT map_values(map(1, 'a', 2, 'b'));
+---------------------------+
|map_values(map(1, a, 2, b))|
+---------------------------+
|                  ["a","b"]|
+---------------------------+


str_to_map(text[, pairDelim[, keyValueDelim]])

Description

  • Splits the text into key/value pairs using the delimiters and creates a map from those pairs.
  • The default pair delimiter pairDelim is ",".
  • The default key/value delimiter keyValueDelim is ":".
  • Both pairDelim and keyValueDelim are treated as regular expressions.

Examples

SELECT str_to_map('a:1,b:2,c:3', ',', ':');
+-----------------------------+
|str_to_map(a:1,b:2,c:3, ,, :)|
+-----------------------------+
|    {"a":"1","b":"2","c":"3"}|
+-----------------------------+

SELECT str_to_map('a');
+-------------------+
|str_to_map(a, ,, :)|
+-------------------+
|         {"a":null}|
+-------------------+
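
Because both delimiters are configurable, strings using '=' and ';' as separators parse just as easily. A minimal sketch (the commented result is what the parsing rules above imply):

SELECT str_to_map('a=1;b=2', ';', '=');
-- expected: {"a":"1","b":"2"}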


Copyright notice: This is the blogger's original article, released under the CC 4.0 BY-SA license. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/Shockang/article/details/121458821
