Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: complete rewrite of the library #63

Merged
merged 8 commits into from
Sep 25, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 8 additions & 1 deletion bantorra.opam
Original file line number Diff line number Diff line change
Expand Up @@ -13,14 +13,21 @@ dev-repo: "git+https://github.com/RedPRL/bantorra.git"
depends: [
"dune" {>= "2.0"}
"ocaml" {>= "5.0"}
"asai"
"algaeff" {>= "0.2"}
"bos" {>= "0.2"}
"bwd" {>= "2.1"}
"conf-git" {post}
"curly" {>= "0.2"}
"ezjsonm" {>= "1.2"}
"json-data-encoding" {>= "0.9"}
"ocamlfind" {>= "1.8"}
"odoc" {with-doc}
]
pin-depends: [
[ "asai.0.1.0~dev" "git+https://github.com/RedPRL/asai" ]
]
build: [
["dune" "subst"] {pinned}
["dune" "build" "-p" name "-j" jobs]
["dune" "build" "-p" name "-j" jobs "@runtest"] {with-test}
["dune" "build" "-p" name "-j" jobs "@doc"] {with-doc}
Expand Down
1 change: 1 addition & 0 deletions docs/dune
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
(documentation)
56 changes: 56 additions & 0 deletions docs/index.mld
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
A {e library} in the Bantorra framework is a tree of units that can be accessed via unit paths from the root. A unit path is a list of strings, such as [std/num/types]. The purpose of the Bantorra framework is to provide a flexible mechanism to map each unit path to some underlying file path. For example, the unit path [std/num/types] might be mapped to the file path [/usr/lib/cool/number/types.data], and the resolution process takes in both what is set up by the application and what is provided by its users.

{1 Introduction}

In the simplest case, there is a one-to-one correspondence between units and files under a directory: the unit path [a/b/c] corresponds to the file [a/b/c.data] where [.data] is the extension specified by the application. The root directory is marked by a special file called {e anchor}, which is a file with a fixed name again specified by the application. For example, the existence of a [dune] file means there is an OCaml library in the eyes of the [dune] building tool. An anchor in the Bantorra framework marks the root of a library. For example, if the anchor file name is [.lib], the file at [/usr/lib/cool/number/.lib] indicates that there is a library containing files under [/usr/lib/cool/number].

It is common for units within a library to access units in another library. To do so, an anchor file may {e mount} another library in the tree, in a way similar to how partitions are mounted in POSIX-compliant systems. Here is a sample anchor file:
{v
{
"format": "1.0.0",
"mounts": [
"std/num": ["local", "/usr/lib/cool/number"]
]
}
v}

The above anchor file mounts the library [number] at [std/num]. With this, the unit path [std/num/types] will be routed to the unit [types] within the library [number]. The resolution is recursive because the mounted library may mount yet another library. The actual interpretation of [["local", "/usr/lib/cool/number"]] is fully controlled by the application---the example assumes that [["local", path]] would be understood as a local [path], but the application can plug in any OCaml function from JSON data to directory paths. A few basic routing functions are provided in {!module:Router}.

{1 Example}

See a {{:https://github.com/RedPRL/bantorra/blob/main/test/Example.ml}tiny example} (available locally as [test/Example.ml]).

{1 Format of Anchors}

An anchor can be in one of the following formats:

{v
{ "format": "1.0.0" }
v}
{v
{
"format": "1.0.0",
"mounts": [
"path/to/lib1": ...
"path/to/lib2": ...
]
}
v}

If the [mounts] field is missing, then the library has no dependencies. Each dependency is specified by its mount point ([mount_point]), the name of the router to find the imported library ([router]), and the argument to the router ([router_argument]). During the resolution, the entire JSON subtree under the field [router_argument] is passed to the router. See {!type:Router.argument} and {!val:Router.make}.

The order of entries in [mounts] does not matter because the dispatching is based on longest prefix match. If no match can be found, then it means the unit path refers to a local unit. The same library can be mounted at multiple points. However, to keep the resolution unambiguous, there cannot be two libraries mounted at the same point. Here is an example demonstrating the longest prefix match:

{v
{
"format": "1.0.0",
"mounts": [
"lib": "stdlib",
"lib/bantorra": ["git", {url: "https://github.com/RedPRL/bantorra"}]
]
}
v}

The unit path [world.orntorra] will be routed to the unit [orntorra] within the [world] library, pending further resolution (as the [world] library might further mount other libraries), while the unit path [world.bantorra.shisho] will be routed to [shisho] in the library corresponding to [https://github.com/RedPRL/bantorra], not the unit [bantorra.shisho] in the [world] library.

If some library is mounted at [world.towitorra], then the original unit with the path [world.towitorra] or a path with the prefix [world.towitorra] is no longer accessible. Moreover, [world.towitorra] cannot point to any unit after the mounting because no unit can be associated with the empty path (the root), and [world.towitorra] means the empty path (the root) in the mounted library, which cannot refer to any unit.
29 changes: 29 additions & 0 deletions src/Bantorra.ml
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
(** The Bantorra library manager. *)

(** {1 Main Modules} *)

module Manager = Manager
(** Library managers. *)

module Router = Router
(** Routers. *)

module ErrorCode = ErrorCode
(** Error codes. *)

module Error = Error
(** Algebraic effects of error reporting. *)

(** {1 Helper Modules} *)

module UnitPath = UnitPath
(** Unit paths. *)

module FilePath = FilePath
(** Basic path manipulation. *)

module File = File
(** Basic I/O. *)

module Marshal = Marshal
(** JSON Serialization. *)
1 change: 1 addition & 0 deletions src/Error.ml
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
include Asai.Logger.Make(ErrorCode)
1 change: 1 addition & 0 deletions src/Error.mli
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
include Asai.Logger.S with module Code := ErrorCode
26 changes: 26 additions & 0 deletions src/ErrorCode.ml
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
type t =
[ `System
| `AnchorNotFound
| `JSONFormat
| `UnitNotFound
| `InvalidLibrary
| `InvalidRoute
| `InvalidRouter
| `Web
]

let default_severity =
function
| `InvalidRouter -> Asai.Severity.Bug
| _ -> Asai.Severity.Error

let to_string : t -> string =
function
| `System -> "sys"
| `AnchorNotFound -> "anchor"
| `JSONFormat -> "json"
| `UnitNotFound -> "unit"
| `InvalidLibrary -> "lib"
| `InvalidRoute -> "route"
| `InvalidRouter -> "router"
| `Web -> "web"
162 changes: 162 additions & 0 deletions src/File.ml
Original file line number Diff line number Diff line change
@@ -0,0 +1,162 @@
module U = Unix
module E = Error
module F = FilePath

(* invariant: absolute path *)
type path = F.t

let (/) = F.add_unit_seg

let wrap_bos =
function
| Ok r -> r
| Error (`Msg msg) -> E.fatalf `System "%s" msg

let get_cwd () = F.of_fpath @@ wrap_bos @@ Bos.OS.Dir.current ()

(** Read the entire file as a string. *)
let read p =
E.tracef "File.read(%a)" (F.pp ~relative_to:(get_cwd())) p @@ fun () ->
wrap_bos @@ Bos.OS.File.read (F.to_fpath p)

(** Write a string to a file. *)
let write p s =
E.tracef "File.write(%a)" (F.pp ~relative_to:(get_cwd())) p @@ fun () ->
wrap_bos @@ Bos.OS.File.write (F.to_fpath p) s

let ensure_dir p =
E.tracef "File.ensure_dir(%a)" (F.pp ~relative_to:(get_cwd())) p @@ fun () ->
ignore @@ wrap_bos @@ Bos.OS.Dir.create (F.to_fpath p)

let file_exists p =
wrap_bos @@ Bos.OS.File.exists (F.to_fpath p)

let locate_anchor ~anchor start_dir =
E.tracef "File.locate_anchor(%s,%a)" anchor (F.pp ~relative_to:(get_cwd())) start_dir @@ fun () ->
let rec go cwd path_acc =
if file_exists (cwd/anchor) then
cwd, UnitPath.of_list path_acc
else
if F.is_root cwd
then E.fatalf `AnchorNotFound "No anchor found all the way up to the root"
else go (F.parent cwd) @@ F.basename cwd :: path_acc
in
go (F.to_dir_path start_dir) []

let locate_hijacking_anchor ~anchor ~root path =
E.tracef "File.hijacking_anchors_exist(%s,%a)" anchor (F.pp ~relative_to:(get_cwd())) root @@ fun () ->
match UnitPath.to_list path with
| [] -> None
| first_seg :: segs ->
let rec loop cwd parts =
if file_exists (cwd/anchor) then
Some cwd
else
match parts with
| [] -> None
| seg :: segs ->
loop (cwd/seg) segs
in
loop (root/first_seg) segs

(** The scheme refers to how various directories should be determined.

It does not correspond to the actual OS that is running. For example, the
[Linux] scheme covers all BSD-like systems and Cygwin on Windows. *)
type scheme = MacOS | Linux | Windows

let uname_s =
lazy begin
Result.to_option @@
Bos.OS.Cmd.(in_null |> run_io Bos.Cmd.(v "uname" % "-s") |> to_string ~trim:true)
end

let guess_scheme =
lazy begin
match Sys.os_type with
| "Unix" ->
begin
match Lazy.force uname_s with
| Some "Darwin" -> MacOS
| _ -> Linux
end
| "Cygwin" -> Linux
| "Win32" -> Windows
| _ -> Linux
end

let get_home () =
F.of_fpath @@ wrap_bos @@ Bos.OS.Dir.user ()

let read_env_path var =
Result.map (F.of_fpath ~relative_to:(get_cwd ())) @@ Bos.OS.Env.path var

(* XXX I did not test the following code on different platforms. *)
let get_xdg_config_home ~app_name =
E.tracef "File.get_xdg_config_home" @@ fun () ->
match read_env_path "XDG_CONFIG_HOME" with
| Ok dir -> dir/app_name
| Error _ ->
match Lazy.force guess_scheme with
| Linux ->
let home =
E.try_with get_home
~fatal:(fun _ -> E.fatalf `System "Both XDG_CONFIG_HOME and HOME are not set")
in
home/".config"/app_name
| MacOS ->
let home =
E.try_with get_home
~fatal:(fun _ -> E.fatalf `System "Both XDG_CONFIG_HOME and HOME are not set")
in
home/"Library"/"Application Support"/app_name
| Windows ->
begin
match read_env_path "APPDATA" with
| Ok app_data ->
app_data/app_name/"config"
| Error _ ->
E.fatalf `System "Both XDG_CONFIG_HOME and APPDATA are not set"
end

(* XXX I did not test the following code on different platforms. *)
let get_xdg_cache_home ~app_name =
E.tracef "File.get_xdg_cache_home" @@ fun () ->
match read_env_path "XDG_CACHE_HOME" with
| Ok dir -> dir/app_name
| Error _ ->
match Lazy.force guess_scheme with
| Linux ->
let home =
E.try_with get_home
~fatal:(fun _ -> E.fatalf `System "Both XDG_CACHE_HOME and HOME are not set")
in
home/".cache"/app_name
| MacOS ->
let home =
E.try_with get_home
~fatal:(fun _ -> E.fatalf `System "Both XDG_CACHE_HOME and HOME are not set")
in
home/"Library"/"Caches"/app_name
| Windows ->
begin
match read_env_path "LOCALAPPDATA" with
| Error _ ->
E.fatalf `System "Both XDG_CACHE_HOME and LOCALAPPDATA are not set"
| Ok local_app_data ->
local_app_data/app_name/"cache"
end

(** OCaml findlib *)

let findlib_init = lazy begin Findlib.init () end

let get_package_dir pkg =
Lazy.force findlib_init;
try
FilePath.of_string @@ Findlib.package_directory pkg
with
| Findlib.No_such_package (pkg, msg) ->
E.fatalf `System "No package named %s: %s" pkg msg
| Findlib.Package_loop pkg ->
E.fatalf `System "Package %s required by itself" pkg
67 changes: 67 additions & 0 deletions src/File.mli
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
(** {1 Path types} *)

type path = FilePath.t

(** {1 Basic I/O} *)

val read : path -> string
(**
[read path] reads the content of string [str] the file at [path] (in binary mode).
If there was already a file at [path], it will be overwritten.
*)

val write : path -> string -> unit
(**
[write path str] writes the string [str] the file at [path] (in binary mode).
If there was already a file at [path], it will be overwritten.
*)

(** {1 Directories} *)

val get_cwd : unit -> path
(**
[get_cwd ()] returns the current working directory.
*)

val ensure_dir : path -> unit
(**
[ensure_dir dir] effectively implements [mkdir -p dir] in OCaml.
*)

(** {1 Locating Files} *)

val file_exists : path -> bool
(**
[file_exists file] checks whether [file] is a regular file.
*)

val locate_anchor : anchor:string -> path -> path * UnitPath.t
(**
[locate_anchor ~anchor dir] finds the closest regular file named [anchor] in [dir] or its ancestors in the file system tree.

@param dir The starting directory.

@return
(1) the first directory that holds a regular file named [anchor] on the way from [dir] to the root directory; and (2) the relative path from the returned directory to [dir].

For example, on a typical Linux system, suppose there is no file called [anchor.txt] under directiors
[/usr/lib/gcc/] and [/usr/lib/], but there is such a file under [/usr/].
[locate_anchor ~anchor:"anchor.txt" "/usr/lib/gcc"] will return ["/usr/", ["lib"; "gcc"]]
and [locate_anchor ~anchor:"anchor.txt" "/usr/"] will return ["/usr/", []].
*)

val locate_hijacking_anchor : anchor:string -> root:path -> UnitPath.t -> path option

(** {1 Special Directories} *)

val get_home : unit -> path

val get_xdg_config_home : app_name:string -> path
(** Get the per-user config directory based on [XDG_CONFIG_HOME]
with reasonable default values on major platforms. *)

val get_xdg_cache_home : app_name:string -> path
(** Get the per-user persistent cache directory based on [XDG_CACHE_HOME]
with reasonable default values on major platforms. *)

val get_package_dir : string -> path
Loading