Comparing changes

base repository: azavea/hiveless
base: v0.0.9
head repository: azavea/hiveless
compare: main
  • 16 commits
  • 104 files changed
  • 1 contributor

Commits on Apr 10, 2022

  1. Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature.
    9f5c82d
  2. 9d702ed

Commits on Apr 11, 2022

  1. 91bf6f7
  2. bb8abd2

Commits on Apr 12, 2022

  1. b16e3b5
  2. Fix typo in a comment

    pomadchin committed Apr 12, 2022
    a38db7f

Commits on Apr 13, 2022

  1. fe486d7

Commits on Apr 14, 2022

  1. Cleanup STIndexInjectorSpec

    pomadchin committed Apr 14, 2022
    cc2e3e1

Commits on Apr 29, 2022

  1. Add ST_Contains optimization and fix the ST_Intersects optimization behavior (#18)

    * Fix the ST_Intersects optimization behavior

    * Add ST_Contains function; remove name definition from all function names

    * Add ST_Contains function optimizations

    pomadchin authored Apr 29, 2022
    5ef8031
  2. Dry code

    pomadchin committed Apr 29, 2022
    4cb6b6d
  3. Refactor Optimization Rules

    pomadchin committed Apr 29, 2022
    eb73ecd
  4. Update STIndexSpec.scala

    pomadchin authored Apr 29, 2022
    8313877

Commits on May 1, 2022

  1. Update README.md

    pomadchin authored May 1, 2022
    2d0bec1
  2. Update README.md

    pomadchin authored May 1, 2022
    c23a75c
  3. Update README.md

    pomadchin authored May 1, 2022
    78a00d6
  4. Update README.md

    pomadchin authored May 1, 2022
    7aad837
Showing with 887 additions and 641 deletions.
  1. +22 −0 README.md
  2. +5 −4 core/src/main/scala/com/azavea/hiveless/HUDF.scala
  3. +0 −78 core/src/main/scala/com/azavea/hiveless/serializers/GenericDeserializer.scala
  4. +0 −34 core/src/main/scala/com/azavea/hiveless/serializers/HDeserialier.scala
  5. +237 −0 core/src/main/scala/com/azavea/hiveless/serializers/HDeserializer.scala
  6. +0 −184 core/src/main/scala/com/azavea/hiveless/serializers/UnaryDeserializer.scala
  7. +4 −4 core/src/main/scala/com/azavea/hiveless/serializers/syntax.scala
  8. +3 −1 core/src/main/scala/com/azavea/hiveless/spark/rules/syntax/syntax.scala
  9. +5 −0 core/src/main/scala/com/azavea/hiveless/utils/HShow.scala
  10. +1 −0 spatial-index/sql/createUDFs.sql
  11. +0 −129 ...ial-index/src/main/scala/com/azavea/hiveless/spark/spatial/rules/SpatialFilterPushdownRules.scala
  12. +24 −0 spatial-index/src/main/scala/com/azavea/hiveless/spark/sql/SpatialFilterPushdownOptimizations.scala
  13. +119 −0 spatial-index/src/main/scala/com/azavea/hiveless/spark/sql/rules/STContainsRule.scala
  14. +128 −0 spatial-index/src/main/scala/com/azavea/hiveless/spark/sql/rules/STIntersectsRule.scala
  15. +45 −0 spatial-index/src/main/scala/com/azavea/hiveless/spark/sql/rules/SpatialFilterPushdownRules.scala
  16. +35 −0 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/ST_Contains.scala
  17. +1 −2 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/ST_CrsFromText.scala
  18. +1 −2 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/ST_ExtentFromGeom.scala
  19. +1 −2 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/ST_ExtentToGeom.scala
  20. +1 −2 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/ST_GeomReproject.scala
  21. +6 −7 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/ST_Intersects.scala
  22. +1 −2 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/ST_MakeExtent.scala
  23. +1 −2 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/ST_PartitionCentroid.scala
  24. +1 −2 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/ST_Z2LatLon.scala
  25. +4 −4 spatial-index/src/main/scala/com/azavea/hiveless/spatial/index/package.scala
  26. +32 −0 spatial-index/src/test/scala/com/azavea/hiveless/InjectOptimizerTestEnvironment.scala
  27. +1 −1 spatial-index/src/test/scala/com/azavea/hiveless/SpatialIndexHiveTestEnvironment.scala
  28. +54 −0 spatial-index/src/test/scala/com/azavea/hiveless/spark/sql/STIndexInjectorSpec.scala
  29. +57 −24 spatial-index/src/test/scala/com/azavea/hiveless/spatial/index/STIndexSpec.scala
  30. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_AntimeridianSafeGeom.scala
  31. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Area.scala
  32. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_AsBinary.scala
  33. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_AsGeoHash.scala
  34. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_AsGeoJson.scala
  35. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_AsLatLonText.scala
  36. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_AsTWKB.scala
  37. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_AsText.scala
  38. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Boundary.scala
  39. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_BufferPoint.scala
  40. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_ByteArray.scala
  41. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_CastToGeometry.scala
  42. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_CastToLineString.scala
  43. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_CastToPoint.scala
  44. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_CastToPolygon.scala
  45. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Centroid.scala
  46. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_ClosestPoint.scala
  47. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Contains.scala
  48. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_CoordDim.scala
  49. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Covers.scala
  50. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Crosses.scala
  51. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Difference.scala
  52. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Dimension.scala
  53. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Disjoint.scala
  54. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Distance.scala
  55. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_DistanceSphere.scala
  56. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Envelope.scala
  57. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Equals.scala
  58. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_ExteriorRing.scala
  59. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_GeoHash.scala
  60. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_GeomFromGeoHash.scala
  61. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_GeomFromGeoJson.scala
  62. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_GeomFromWKB.scala
  63. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_GeomFromWKT.scala
  64. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_GeometryN.scala
  65. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_InteriorRingN.scala
  66. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Intersection.scala
  67. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Intersects.scala
  68. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_IsClosed.scala
  69. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_IsCollection.scala
  70. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_IsEmpty.scala
  71. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_IsRing.scala
  72. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_IsSimple.scala
  73. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_IsValid.scala
  74. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Length.scala
  75. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_LengthSphere.scala
  76. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_MLineFromText.scala
  77. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_MPointFromText.scala
  78. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_MPolyFromText.scala
  79. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_MakeBBOX.scala
  80. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_MakeBox2D.scala
  81. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_MakeLine.scala
  82. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_MakePoint.scala
  83. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_MakePointM.scala
  84. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_MakePolygon.scala
  85. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_NumGeometries.scala
  86. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_NumPoints.scala
  87. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Overlaps.scala
  88. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_PointFromGeoHash.scala
  89. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_PointFromText.scala
  90. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_PointFromWKB.scala
  91. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_PointN.scala
  92. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_PolygonFromText.scala
  93. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Relate.scala
  94. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_RelateBool.scala
  95. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Simplify.scala
  96. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_SimplifyPreserveTopology.scala
  97. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Touches.scala
  98. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Translate.scala
  99. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Within.scala
  100. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_X.scala
  101. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_Y.scala
  102. +1 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/ST_lineFromText.scala
  103. +2 −2 spatial/src/main/scala/com/azavea/hiveless/spatial/package.scala
  104. +23 −9 spatial/src/test/scala/com/azavea/hiveless/SpatialHiveTestEnvironment.scala
22 changes: 22 additions & 0 deletions README.md
@@ -41,6 +41,28 @@ CREATE OR REPLACE FUNCTION st_simplify as 'com.azavea.hiveless.spatial.ST_Simpli

The full list of supported functions can be found [here](./spatial/sql/createUDFs.sql).

## Spatial Query Optimizations

Two optimizations are supported, one for `ST_Intersects` and one for `ST_Contains`. Both allow Spark to push down predicates when possible.

To enable optimizations:

```scala
import com.azavea.hiveless.spark.sql.rules.SpatialFilterPushdownRules
import org.apache.spark.sql.SparkSession

val spark: SparkSession = ???
SpatialFilterPushdownRules.registerOptimizations(spark.sqlContext)
```

The optimizations can also be enabled through the Spark configuration, via the optimizations injector:

```scala
import com.azavea.hiveless.spark.sql.SpatialFilterPushdownOptimizations
import org.apache.spark.SparkConf

val conf: SparkConf = ???
conf.set("spark.sql.extensions", classOf[SpatialFilterPushdownOptimizations].getName)
```
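
As a minimal sketch of the configuration route, the same setting could also be passed while building a session. This assumes that Spark SQL and the hiveless spatial-index module are on the classpath; the object name, the local master, and the application name are illustrative choices, and the extension class name is taken from the file path in this comparison:

```scala
import org.apache.spark.sql.SparkSession

object ExtensionsSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark SQL and hiveless spatial-index are on the classpath.
    val spark = SparkSession
      .builder()
      .master("local[*]")                  // local master, for illustration only
      .appName("hiveless-pushdown-sketch") // hypothetical application name
      // Extension class name as it appears in this diff's file list
      .config("spark.sql.extensions", "com.azavea.hiveless.spark.sql.SpatialFilterPushdownOptimizations")
      .getOrCreate()

    // ... run spatial queries here ...

    spark.stop()
  }
}
```

`spark.sql.extensions` is a static setting, so it has to be supplied before the session is created. Setting it on an already running session is not expected to take effect.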

## License
Code is provided under the Apache 2.0 license available at http://opensource.org/licenses/Apache-2.0,
as well as in the LICENSE file. This is the same license that Spark uses.
9 changes: 5 additions & 4 deletions core/src/main/scala/com/azavea/hiveless/HUDF.scala
@@ -16,13 +16,14 @@

 package com.azavea.hiveless

-import com.azavea.hiveless.serializers.{GenericDeserializer, HDeserialier, HSerializer}
+import com.azavea.hiveless.serializers.{HDeserializer, HSerializer}
 import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
 import org.apache.spark.sql.types.DataType

 import scala.util.{Failure, Success, Try}

-abstract class HUDF[P, R](implicit d: GenericDeserializer[Try, P], s: HSerializer[R]) extends HGenericUDF[R] {
+abstract class HUDF[P, R](implicit d: HDeserializer[Try, P], s: HSerializer[R]) extends HGenericUDF[R] {
+  def name: String = this.getClass.getName.split("\\.").last
   def dataType: DataType = s.dataType
   def serialize: R => Any = s.serialize
   def function: P => R
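
The `name` member added in this hunk derives the UDF name from the runtime class name. A minimal sketch of that expression applied to a plain string follows; the `UdfNameSketch` object and the `simpleName` helper are hypothetical and not part of hiveless:

```scala
object UdfNameSketch {
  // Same expression as in the hunk: keep the last dot-separated segment.
  // The dot is escaped because String.split takes a regular expression.
  def simpleName(fullyQualified: String): String =
    fullyQualified.split("\\.").last

  def main(args: Array[String]): Unit =
    println(simpleName("com.azavea.hiveless.spatial.ST_Contains")) // prints "ST_Contains"
}
```

Note that `getClass.getName` returns the JVM binary name, so a Scala `object` would yield a trailing `$` and a nested class would keep its `Outer$Inner` form.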
@@ -33,7 +34,7 @@ abstract class HUDF[P, R](implicit d: GenericDeserializer[Try, P], s: HSerialize
     // if arguments are null we can't deserialize it
     // however nulls can appear due to filtering results (that's how it is handled)
     // that is not an error state
-    case Failure(HDeserialier.Errors.NullArgument) => null.asInstanceOf[R]
-    case Failure(e)                                => throw e
+    case Failure(HDeserializer.Errors.NullArgument) => null.asInstanceOf[R]
+    case Failure(e)                                 => throw e
   }
 }
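
The null handling in this hunk can be sketched without Hive or Spark. In the sketch below, `NullArgument`, `deserialize`, and `eval` are hypothetical stand-ins for `HDeserializer.Errors.NullArgument` and the `HUDF` machinery, which this comparison does not show in full:

```scala
import scala.util.{Failure, Success, Try}

object NullArgumentSketch {
  // Hypothetical stand-in for HDeserializer.Errors.NullArgument
  case object NullArgument extends RuntimeException("null argument")

  // Hypothetical deserializer: a null input is reported as NullArgument,
  // any other failure (e.g. a wrong type) is captured by Try as-is.
  def deserialize(arg: Any): Try[String] =
    if (arg == null) Failure(NullArgument) else Try(arg.asInstanceOf[String])

  // A null argument is not an error state: it propagates as a null result.
  // Every other failure is rethrown.
  def eval[R](arg: Any)(function: String => R): R =
    deserialize(arg).map(function) match {
      case Success(result)       => result
      case Failure(NullArgument) => null.asInstanceOf[R]
      case Failure(e)            => throw e
    }
}
```

With these definitions, `NullArgumentSketch.eval[String](null)(_.toUpperCase)` returns `null`, while a non-string argument such as `42` surfaces as a `ClassCastException`.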

This file was deleted.

This file was deleted.
