An Akka Streams source of Azure Data Lake data

Read Azure Data Lake Storage into Akka Streams

Replay historical data-at-rest into an existing code base designed for streaming.

Current Storage Sources

  1. GZip files of UTF-8, newline-delimited strings
  2. Other storage implementations TBD
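The GZip source format above can be illustrated without Akka: conceptually, each stored file is gunzipped and then split on newlines into individual records. A minimal sketch using only the Java standard library (the sample records are hypothetical and stand in for a file stored in Azure Data Lake):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.charset.StandardCharsets.UTF_8
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.io.Source

// Build a small gzipped, newline-delimited UTF-8 payload in memory.
val raw = "event-1\nevent-2\nevent-3\n"
val buf = new ByteArrayOutputStream()
val gz  = new GZIPOutputStream(buf)
gz.write(raw.getBytes(UTF_8))
gz.close()

// Decompress and split into the individual records a stream would emit.
val in    = new GZIPInputStream(new ByteArrayInputStream(buf.toByteArray))
val lines = Source.fromInputStream(in, "UTF-8").getLines().toList
println(lines) // List(event-1, event-2, event-3)
```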

Uses the Azure Data Lake Store (ADLS) API.

USAGE

Update your `build.sbt` dependencies with:

    // https://mvnrepository.com/artifact/tech.navicore/navilake
    libraryDependencies += "tech.navicore" %% "navilake" % "1.3.0"

The example below reads gzip data from Azure Data Lake: create a config, a connector, and a source, then run the source into a sink.

    val consumer = ... // some Sink that feeds your streaming pipeline
    ...
    // credentials and location
    implicit val cfg: LakeConfig = LakeConfig(ACCOUNTFQDN, CLIENTID, AUTHEP, CLIENTKEY, Some(PATH))
    val connector: ActorRef = actorSystem.actorOf(GzipConnector.props)
    val src = NaviLake(connector)
    ...
    // materialize the stream: emit each record from the lake into the sink
    src.runWith(consumer)
    ...
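Putting the pieces together, here is a hedged end-to-end sketch. `LakeConfig`, `GzipConnector`, and `NaviLake` come from this library, but the package path, actor system name, credential values, and the choice of `Sink.foreach(println)` are illustrative assumptions, not prescribed by it:

```scala
import akka.actor.{ActorRef, ActorSystem}
import akka.stream.scaladsl.Sink
import tech.navicore.navilake.{GzipConnector, LakeConfig, NaviLake} // assumed package path

object ReplayExample extends App {
  implicit val actorSystem: ActorSystem = ActorSystem("replay") // name is arbitrary

  // Placeholder credentials and location - supply your own ADLS account values.
  implicit val cfg: LakeConfig = LakeConfig(
    "myaccount.azuredatalakestore.net",                   // account FQDN
    "my-client-id",                                       // service principal client id
    "https://login.microsoftonline.com/TENANT/oauth2/token", // auth endpoint
    "my-client-key",                                      // client secret
    Some("/data/events")                                  // path within the lake
  )

  val connector: ActorRef = actorSystem.actorOf(GzipConnector.props)

  // Each element is one decompressed, newline-delimited record.
  NaviLake(connector).runWith(Sink.foreach(println))
}
```

`Sink.foreach(println)` is the simplest possible consumer; in practice you would replace it with the sink of the streaming pipeline you are replaying into.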