An Akka Streams source of Azure Data Lake data

Read Azure Data Lake Storage into Akka Streams

Replay historical data-at-rest into an existing code base designed for streaming.

Current Storage Sources

  1. GZip files of UTF-8, newline-delimited strings
  2. Other storage implementations TBD
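The GZip source format above can be illustrated without Akka: conceptually, each stored file is gunzipped and then split on newlines into individual records. A minimal sketch using only the Java standard library (the sample records are hypothetical and stand in for a file stored in Azure Data Lake):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.charset.StandardCharsets.UTF_8
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.io.Source

// Build a small gzipped, newline-delimited UTF-8 payload in memory.
val raw = "event-1\nevent-2\nevent-3\n"
val buf = new ByteArrayOutputStream()
val gz  = new GZIPOutputStream(buf)
gz.write(raw.getBytes(UTF_8))
gz.close()

// Decompress and split into the individual records a stream would emit.
val in    = new GZIPInputStream(new ByteArrayInputStream(buf.toByteArray))
val lines = Source.fromInputStream(in, "UTF-8").getLines().toList
println(lines) // List(event-1, event-2, event-3)
```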

Uses the Azure Data Lake Store (ADLS) API.

USAGE

Update your `build.sbt` dependencies with:

    // https://mvnrepository.com/artifact/tech.navicore/navilake
    libraryDependencies += "tech.navicore" %% "navilake" % "1.3.0"

The example below reads gzip data from Azure Data Lake: create a config, a connector, and a source, then run the source into a sink.

    val consumer = ... // some Sink that feeds your streaming pipeline
    ...
    // credentials and location
    implicit val cfg: LakeConfig = LakeConfig(ACCOUNTFQDN, CLIENTID, AUTHEP, CLIENTKEY, Some(PATH))
    val connector: ActorRef = actorSystem.actorOf(GzipConnector.props)
    val src = NaviLake(connector)
    ...
    // materialize the stream: emit each record from the lake into the sink
    src.runWith(consumer)
    ...
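Putting the pieces together, here is a hedged end-to-end sketch. `LakeConfig`, `GzipConnector`, and `NaviLake` come from this library, but the package path, actor system name, credential values, and the choice of `Sink.foreach(println)` are illustrative assumptions, not prescribed by it:

```scala
import akka.actor.{ActorRef, ActorSystem}
import akka.stream.scaladsl.Sink
import tech.navicore.navilake.{GzipConnector, LakeConfig, NaviLake} // assumed package path

object ReplayExample extends App {
  implicit val actorSystem: ActorSystem = ActorSystem("replay") // name is arbitrary

  // Placeholder credentials and location - supply your own ADLS account values.
  implicit val cfg: LakeConfig = LakeConfig(
    "myaccount.azuredatalakestore.net",                   // account FQDN
    "my-client-id",                                       // service principal client id
    "https://login.microsoftonline.com/TENANT/oauth2/token", // auth endpoint
    "my-client-key",                                      // client secret
    Some("/data/events")                                  // path within the lake
  )

  val connector: ActorRef = actorSystem.actorOf(GzipConnector.props)

  // Each element is one decompressed, newline-delimited record.
  NaviLake(connector).runWith(Sink.foreach(println))
}
```

`Sink.foreach(println)` is the simplest possible consumer; in practice you would replace it with the sink of the streaming pipeline you are replaying into.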