Files and Directories
We'll see here a handful of functions and libraries to operate on files and directories.
In this chapter, we use mainly namestrings to specify filenames. In a recipe or two we also use pathnames.
Many functions will come from UIOP, so we suggest you have a look directly at it:
Of course, do not miss:
Use file-namestring
to get a file name from a pathname:
(file-namestring #p"/path/to/file.lisp") ;; => "file.lisp"
The file extension is called "pathname type" in Lisp parlance:
(pathname-type "~/foo.org") ;; => "org"
The basename is called the "pathname name" -
(pathname-name "~/foo.org") ;; => "foo"
(pathname-name "~/foo") ;; => "foo"
If a directory pathname has a trailing slash, pathname-name may return nil; use pathname-directory instead:
(pathname-name "~/foo/") ;; => NIL
(first (last (pathname-directory #P"~/foo/"))) ;; => "foo"
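The two cases above can be folded into one small helper. This is only a sketch: path-basename is a hypothetical name (it is neither standard nor part of UIOP), and it assumes the implementation parses a trailing slash as a directory component, as SBCL does on Unix.

```lisp
(defun path-basename (path)
  "Return the last component of PATH, whether it names a file or a directory.
Hypothetical helper for illustration; not part of the standard or of UIOP."
  (let ((pathname (pathname path)))
    (or (pathname-name pathname)
        ;; No name component: fall back to the last directory component.
        ;; It can be a keyword such as :ABSOLUTE for "/", so only keep strings.
        (let ((last-dir (first (last (pathname-directory pathname)))))
          (and (stringp last-dir) last-dir)))))
```

Called on "/foo/bar.txt" it goes through pathname-name; called on "/foo/bar/" it goes through pathname-directory.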
(uiop:pathname-parent-directory-pathname #P"/foo/bar/quux/")
;; => #P"/foo/bar/"
Use the function probe-file, which returns a generalized boolean: either nil if the file doesn't exist, or its truename (which might be different from the argument you supplied).
For more portability, use uiop:probe-file* or uiop:file-exists-p, which return the file pathname (if it exists).
$ ln -s /etc/passwd foo
* (probe-file "/etc/passwd")
#p"/etc/passwd"
* (probe-file "foo")
#p"/etc/passwd"
* (probe-file "bar")
NIL
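As a rough sketch of how the UIOP predicates can be combined, the function below tells files from directories. The name path-kind is made up for this example; uiop:directory-exists-p and uiop:file-exists-p are the UIOP functions it relies on.

```lisp
(require "asdf") ; UIOP ships with ASDF

(defun path-kind (path)
  "Return :DIRECTORY, :FILE, or NIL if PATH does not exist.
Hypothetical helper for illustration."
  (cond ((uiop:directory-exists-p path) :directory)
        ((uiop:file-exists-p path) :file)
        (t nil)))
```

Both predicates return the pathname on success, so they can also be used directly when the pathname itself is needed.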
For portability, use uiop:native-namestring:
(uiop:native-namestring "~/.emacs.d/")
"/home/me/.emacs.d/"
It also expands the tilde for files and directories that don't exist:
(uiop:native-namestring "~/foo987.txt")
;; => "/home/me/foo987.txt"
On several implementations (CCL, ABCL, ECL, CLISP, LispWorks), namestring works similarly. On SBCL, if the file or directory doesn't exist, namestring doesn't expand the path but returns the argument, with the tilde.
With files that exist, you can also use truename. But, at least on SBCL, it signals an error if the path doesn't exist.
Use uiop:native-namestring again:
CL-USER> (uiop:native-namestring #p"~/foo/")
"C:\\Users\\You\\foo\\"
See also uiop:parse-native-namestring for the inverse operation.
The function ensure-directories-exist creates the directories if they do not exist:
(ensure-directories-exist "foo/bar/baz/")
This may create foo, bar and baz. Don't forget the trailing slash.
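One way to avoid forgetting the trailing slash is to normalize the path first. The sketch below assumes UIOP is available; make-scratch-directory is a hypothetical helper name, while uiop:ensure-directory-pathname and uiop:temporary-directory are UIOP functions.

```lisp
(require "asdf") ; UIOP ships with ASDF

(defun make-scratch-directory (name)
  "Create a directory called NAME under the temporary directory.
Return its pathname, which always ends with a slash.
Hypothetical helper for illustration."
  (let ((dir (uiop:ensure-directory-pathname
              ;; Without normalization, NAME would be parsed as a file name
              ;; and only its parent directory would be created.
              (merge-pathnames name (uiop:temporary-directory)))))
    (ensure-directories-exist dir)
    dir))
```

ensure-directories-exist also returns a second value that is true when it actually had to create something, which can be handy for logging.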
Use uiop:delete-directory-tree with a pathname (#p), a trailing slash and the :validate key:
;; mkdir dirtest
(uiop:delete-directory-tree #p"dirtest/" :validate t)
You can use pathname around a string that designates a directory:
(defun rmdir (path)
(uiop:delete-directory-tree (pathname path) :validate t))
UIOP also has delete-empty-directory.
cl-fad has (fad:delete-directory-and-files "dirtest").
Use merge-pathnames, with one thing to note: if you want to append directories, the second argument must have a trailing /.
As always, look at UIOP functions. We have a uiop:merge-pathnames* equivalent which fixes corner cases.
So, here's how to append a directory to another one:
(merge-pathnames "otherpath" "/home/vince/projects/")
;; important: ^^
;; a trailing / denotes a directory.
;; => #P"/home/vince/projects/otherpath"
Look at the difference: if you don't include a trailing slash in either path, otherpath and projects are seen as files, so otherpath is appended to the base directory containing projects:
(merge-pathnames "otherpath" "/home/vince/projects")
;; #P"/home/vince/otherpath"
;; ^^ no "projects", because it was seen as a file.
or again, with otherpath/ (a trailing /) but projects seen as a file:
(merge-pathnames "otherpath/" "/home/vince/projects")
;; #P"/home/vince/otherpath/projects"
;; ^^ inserted here
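If the base directory may arrive without its trailing slash, it can be normalized before merging. This is a minimal sketch: join-directory is a name invented for this example, and it leans on uiop:ensure-directory-pathname to turn the base into a directory pathname.

```lisp
(require "asdf") ; UIOP ships with ASDF

(defun join-directory (base relative)
  "Merge RELATIVE onto BASE, treating BASE as a directory
even when it lacks a trailing slash. Hypothetical helper."
  (merge-pathnames relative (uiop:ensure-directory-pathname base)))

;; "projects" is now kept, with or without the trailing slash on the base:
(join-directory "/home/vince/projects" "otherpath")
(join-directory "/home/vince/projects/" "otherpath/")
```

Whether RELATIVE itself is a file or a directory still depends on its own trailing slash.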
Use uiop/os:getcwd:
(uiop/os:getcwd)
;; #P"/home/vince/projects/cl-cookbook/"
;; ^ with a trailing slash, useful for merge-pathnames
Use asdf:system-relative-pathname system path.
Say you are working inside mysystem. It has an ASDF system declaration, and the system is loaded in your Lisp image. This ASDF file is somewhere on your filesystem and you want the path to src/web/. Do this:
(asdf:system-relative-pathname "mysystem" "src/web/")
;; => #P"/home/vince/projects/mysystem/src/web/"
This will also work on another user's machine, where the system sources are stored in a different location.
Use uiop:chdir path:
(uiop:chdir "/bin/")
0
The trailing slash in path is optional.
Or, to set the current directory for the next operation only, use uiop:with-current-directory:
(let ((dir "/path/to/another/directory/"))
  (uiop:with-current-directory (dir)
    (uiop:directory-files "./")))
Common Lisp has open and close functions which resemble the functions of the same name from other programming languages you're probably familiar with. However, it is almost always advisable to use the macro with-open-file instead. Not only will this macro open the file for you and close it when you're done, it'll also take care of closing it if your code leaves the body abnormally (such as by a use of go, return-from, or throw). A typical use of with-open-file looks like this:
(with-open-file (str <_file-spec_>
:direction <_direction_>
:if-exists <_if-exists_>
:if-does-not-exist <_if-does-not-exist_>)
(your code here))
- str is a variable which'll be bound to the stream which is created by opening the file.
- <_file-spec_> will be a truename or a pathname.
- <_direction_> is usually :input (meaning you want to read from the file), :output (meaning you want to write to the file) or :io (which is for reading and writing at the same time) - the default is :input.
- <_if-exists_> specifies what to do if you want to open a file for writing and a file with that name already exists - this option is ignored if you just want to read from the file. The default is :error which means that an error is signalled. Other useful options are :supersede (meaning that the new file will replace the old one), :append (content is added to the file), nil (the stream variable will be bound to nil), and :rename (i.e. the old file is renamed).
- <_if-does-not-exist_> specifies what to do if the file you want to open does not exist. It is one of :error for signalling an error, :create for creating an empty file, or nil for binding the stream variable to nil. The default is, to be brief, to do the right thing depending on the other options you provided. See the CLHS for details.
Note that there are a lot more options to with-open-file. See the CLHS entry for open for all the details. You'll find some examples on how to use with-open-file below. Also note that you usually don't need to provide any keyword arguments if you just want to open an existing file for reading.
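To make the :if-does-not-exist option concrete, here is a small sketch that reads the first line of a file and quietly returns nil when the file is missing. The function name first-line is hypothetical.

```lisp
(defun first-line (path)
  "Return the first line of the file at PATH, or NIL if the file
does not exist or is empty. Hypothetical helper for illustration."
  ;; With :if-does-not-exist nil, the stream variable is bound to NIL
  ;; instead of an error being signalled.
  (with-open-file (in path :if-does-not-exist nil)
    (when in
      (read-line in nil))))
```

Without the option, opening a missing file for input signals an error, which is often what you want in scripts that should fail loudly.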
It's quite common to need to access the contents of a file in string form, or to get a list of lines.
uiop is included in ASDF (there is no extra library to install or system to load) and has the following functions:
(uiop:read-file-string "file.txt")
and
(uiop:read-file-lines "file.txt")
Otherwise, this can be achieved with the read-line or read-char functions, but that probably won't be the best solution. The file might not be divided into multiple lines, or reading one character at a time might cause significant performance problems. To solve these problems, you can read files using buckets of specific sizes.
(with-output-to-string (out)
  (with-open-file (in "/path/to/big/file")
    (loop with buffer = (make-array 8192 :element-type 'character)
          for n-characters = (read-sequence buffer in)
          while (< 0 n-characters)
          do (write-sequence buffer out :start 0 :end n-characters))))
Furthermore, you're free to change the format of the read/written data, instead of using elements of type character every time. For instance, you can set the :element-type argument of with-open-file and make-array to '(unsigned-byte 8) to read data in octets. Note that with-output-to-string only accepts character element types, so octets have to be collected in a vector instead.
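A minimal sketch of reading a whole file as octets follows. The name read-file-octets is made up for this example, and it assumes the file fits comfortably in memory, since it allocates one vector of the file's length.

```lisp
(defun read-file-octets (path)
  "Return the contents of the file at PATH as a vector of octets.
Hypothetical helper; assumes the file is small enough to hold in memory."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    ;; On a binary stream, FILE-LENGTH is measured in stream elements,
    ;; here octets.
    (let ((buffer (make-array (file-length in)
                              :element-type '(unsigned-byte 8))))
      (read-sequence buffer in)
      buffer)))
```

For large files, the bucket-based loop shown earlier can be adapted by giving the buffer the same octet element type.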
To avoid an ASCII stream decoding error you might want to specify a UTF-8 encoding:
(with-open-file (in "/path/to/big/file"
:external-format :utf-8)
...
Sometimes you don't control the internals of a library, so you'd better set the default encoding to utf-8. Add this line to your ~/.sbclrc:
(setf sb-impl::*default-external-format* :utf-8)
and optionally
(setf sb-alien::*default-c-string-external-format* :utf-8)
read-line will read one line from a stream (which defaults to standard input), the end of which is determined by either a newline character or the end of the file. It will return this line as a string without the trailing newline character. (Note that read-line has a second return value which is true if there was no trailing newline, i.e. if the line was terminated by the end of the file.) read-line will by default signal an error if the end of the file is reached. You can inhibit this by supplying NIL as the second argument. If you do this, read-line will return nil if it reaches the end of the file.
(with-open-file (stream "/etc/passwd")
(do ((line (read-line stream nil)
(read-line stream nil)))
((null line))
(print line)))
You can also supply a third argument which will be used instead of nil to signal the end of the file:
(with-open-file (stream "/etc/passwd")
(loop for line = (read-line stream nil 'foo)
until (eq line 'foo)
do (print line)))
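The same end-of-file handling can be used to gather lines instead of printing them. This is a sketch with a hypothetical function name, collect-lines; it works on any character input stream, including one opened with with-open-file.

```lisp
(defun collect-lines (stream)
  "Read STREAM until end of file and return its lines as a list of strings.
Hypothetical helper for illustration."
  (loop for line = (read-line stream nil) ; NIL at end of file
        while line
        collect line))

;; Example on a string stream, so no file is needed:
(with-input-from-string (in (format nil "first~%second~%third"))
  (collect-lines in))
```

An empty line is returned as the empty string, which is not nil, so it does not stop the loop early.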
read-char is similar to read-line, but it only reads one character as opposed to one line. Of course, newline characters aren't treated differently from other characters by this function.
(with-open-file (stream "/etc/passwd")
(do ((char (read-char stream nil)
(read-char stream nil)))
((null char))
(print char)))
You can 'look at' the next character of a stream without actually removing it from there - this is what the function peek-char is for. It can be used for three different purposes depending on its first (optional) argument (the second one being the stream it reads from): If the first argument is nil, peek-char will just return the next character that's waiting on the stream:
CL-USER> (with-input-from-string (stream "I'm not amused")
(print (read-char stream))
(print (peek-char nil stream))
(print (read-char stream))
(values))
#\I
#\'
#\'
If the first argument is T, peek-char will skip whitespace characters, i.e. it will return the next non-whitespace character that's waiting on the stream. The whitespace characters will vanish from the stream as if they had been read by read-char:
CL-USER> (with-input-from-string (stream "I'm not amused")
(print (read-char stream))
(print (read-char stream))
(print (read-char stream))
(print (peek-char t stream))
(print (read-char stream))
(print (read-char stream))
(values))
#\I
#\'
#\m
#\n
#\n
#\o
If the first argument to peek-char is a character, the function will skip all characters until that particular character is found:
CL-USER> (with-input-from-string (stream "I'm not amused")
(print (read-char stream))
(print (peek-char #\a stream))
(print (read-char stream))
(print (read-char stream))
(values))
#\I
#\a
#\a
#\m
Note that peek-char has further optional arguments to control its behaviour on end-of-file similar to those for read-line and read-char (and it will signal an error by default):
CL-USER> (with-input-from-string (stream "I'm not amused")
(print (read-char stream))
(print (peek-char #\d stream))
(print (read-char stream))
(print (peek-char nil stream nil 'the-end))
(values))
#\I
#\d
#\d
THE-END
You can also put one character back onto the stream with the function unread-char. Use it when, after reading a character, you decide that you should have used peek-char instead of read-char:
CL-USER> (with-input-from-string (stream "I'm not amused")
(let ((c (read-char stream)))
(print c)
(unread-char c stream)
(print (read-char stream))
(values)))
#\I
#\I
Note that the front of a stream doesn't behave like a stack: You can only put back exactly one character onto the stream. Also, you must put back the same character that has been read previously, and you can't unread a character if none has been read before.
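Looking ahead is what makes small hand-written parsers possible. The sketch below, with the hypothetical name read-unsigned-integer, skips whitespace with peek-char and then consumes digits only, leaving the first non-digit character on the stream.

```lisp
(defun read-unsigned-integer (stream)
  "Skip leading whitespace, then read consecutive decimal digits from STREAM.
Return the integer, or NIL if no digit was found.
Hypothetical helper for illustration."
  ;; (peek-char t ...) skips whitespace; it returns NIL at end of file here.
  (when (peek-char t stream nil)
    (loop for c = (peek-char nil stream nil)
          while (and c (digit-char-p c))
          collect (read-char stream) into digits
          finally (return (and digits
                               (parse-integer (coerce digits 'string)))))))

;; The #\a that follows the number is still available afterwards:
(with-input-from-string (in "  42abc")
  (list (read-unsigned-integer in) (read-char in)))
```

Because the digit test happens on the peeked character, nothing has to be put back with unread-char.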
Use the function file-position for random access to a file. If this function is used with one argument (a stream), it will return the current position within the stream. If it's used with two arguments (see below), it will actually change the file position in the stream.
CL-USER> (with-input-from-string (stream "I'm not amused")
(print (file-position stream))
(print (read-char stream))
(print (file-position stream))
(file-position stream 4)
(print (file-position stream))
(print (read-char stream))
(print (file-position stream))
(values))
0
#\I
1
4
#\n
5
With with-open-file, specify :direction :output and use write-sequence inside:
(with-open-file (f <pathname> :direction :output
:if-exists :supersede
:if-does-not-exist :create)
(write-sequence s f))
If the file exists, you can also :append content to it.
If it doesn't exist, you can :error out. See the standard for more details.
The library Alexandria has a function called write-string-into-file:
(alexandria:write-string-into-file content "file.txt")
Alternatively, the library str has the to-file function.
(str:to-file "file.txt" content) ;; with optional options
Both alexandria:write-string-into-file and str:to-file take the same keyword arguments as cl:open that control file creation: :if-exists and :if-does-not-exist.
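For a version that needs no library, the :append behaviour can be sketched with with-open-file alone. The function name append-line is hypothetical.

```lisp
(defun append-line (path line)
  "Append LINE, followed by a newline, to the file at PATH.
The file is created if it does not exist. Hypothetical helper."
  (with-open-file (out path :direction :output
                            :if-exists :append
                            :if-does-not-exist :create)
    (write-line line out)))
```

Calling it twice on the same path leaves two lines in the file, whereas :supersede would keep only the last one.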
Osicat is a lightweight operating system interface for Common Lisp on POSIX-like systems, including Windows. With Osicat we can get and set environment variables (now doable with uiop:getenv), manipulate files and directories, pathnames and a bit more.
file-attributes is a newer and lighter OS portability library specifically for getting file attributes, using system calls (cffi).
SBCL with its sb-posix contrib can be used too.
Once Osicat is installed, it also defines the osicat-posix system, which permits us to get file attributes.
(ql:quickload "osicat")
(let ((stat (osicat-posix:stat #P"./files.md")))
(osicat-posix:stat-size stat)) ;; => 10629
We can get the other attributes with the following methods:
osicat-posix:stat-dev
osicat-posix:stat-gid
osicat-posix:stat-ino
osicat-posix:stat-uid
osicat-posix:stat-mode
osicat-posix:stat-rdev
osicat-posix:stat-size
osicat-posix:stat-atime
osicat-posix:stat-ctime
osicat-posix:stat-mtime
osicat-posix:stat-nlink
osicat-posix:stat-blocks
osicat-posix:stat-blksize
Install the library with
(ql:quickload "file-attributes")
Its package is org.shirakumo.file-attributes. You can use a package-local nickname for shorter access to its functions, for example:
(uiop:add-package-local-nickname :file-attributes :org.shirakumo.file-attributes)
Then simply use the functions:
- access-time, modification-time, creation-time. You can setf them.
- owner, group, and attributes. The values used are OS specific for these functions. The attributes flag can be decoded and encoded via a standardised form with decode-attributes and encode-attributes.
CL-USER> (file-attributes:decode-attributes
(file-attributes:attributes #p"test.txt"))
(:READ-ONLY NIL :HIDDEN NIL :SYSTEM-FILE NIL :DIRECTORY NIL :ARCHIVED T :DEVICE
NIL :NORMAL NIL :TEMPORARY NIL :SPARSE NIL :LINK NIL :COMPRESSED NIL :OFFLINE
NIL :NOT-INDEXED NIL :ENCRYPTED NIL :INTEGRITY NIL :VIRTUAL NIL :NO-SCRUB NIL
:RECALL NIL)
See its documentation.
This contrib is loaded by default on POSIX systems.
First get a stat object for a file, then get the stat you want:
CL-USER> (sb-posix:stat "test.txt")
#<SB-POSIX:STAT {10053FCBE3}>
CL-USER> (sb-posix:stat-mtime *)
1686671405
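When only the size and modification time are needed, standard Common Lisp may be enough, without any of the libraries above. The sketch below uses file-length and file-write-date; the name basic-file-info is hypothetical. Note that file-write-date returns a universal time (seconds since 1900), not a Unix timestamp like stat-mtime; the two differ by 2208988800 seconds.

```lisp
(defun basic-file-info (path)
  "Return a plist with the size in octets and the write date of PATH,
using only standard functions. Hypothetical helper for illustration."
  (list :size (with-open-file (in path :element-type '(unsigned-byte 8))
                (file-length in))
        ;; Universal time, i.e. seconds since 1900-01-01 00:00 GMT.
        :write-date (file-write-date path)))
```

Anything beyond this, such as owner, permissions or inode, still requires one of the libraries described in this section.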
Some functions below return pathnames, so you might need the following:
(namestring #p"/foo/bar/baz.txt") ==> "/foo/bar/baz.txt"
(directory-namestring #p"/foo/bar/baz.txt") ==> "/foo/bar/"
(file-namestring #p"/foo/bar/baz.txt") ==> "baz.txt"
(uiop:directory-files "./")
Returns a list of pathnames:
(#P"/home/vince/projects/cl-cookbook/.emacs"
#P"/home/vince/projects/cl-cookbook/.gitignore"
#P"/home/vince/projects/cl-cookbook/AppendixA.jpg"
#P"/home/vince/projects/cl-cookbook/AppendixB.jpg"
#P"/home/vince/projects/cl-cookbook/AppendixC.jpg"
#P"/home/vince/projects/cl-cookbook/CHANGELOG"
#P"/home/vince/projects/cl-cookbook/CONTRIBUTING.md"
[…]
(uiop:subdirectories "./")
(#P"/home/vince/projects/cl-cookbook/.git/"
#P"/home/vince/projects/cl-cookbook/.sass-cache/"
#P"/home/vince/projects/cl-cookbook/_includes/"
#P"/home/vince/projects/cl-cookbook/_layouts/"
#P"/home/vince/projects/cl-cookbook/_site/"
#P"/home/vince/projects/cl-cookbook/assets/")
In addition to the above functions, we mention solutions that lazily traverse a directory. They don't load the entire list of files before returning it.
Osicat has with-directory-iterator:
(with-directory-iterator (next "/")
(loop for entry = (next)
while entry
when (member :group-write (file-permissions entry))
collect entry))
;; => (#P"tmp/")
LispWorks has the fast-directory-files function, and AllegroCL has map-over-directory.
See uiop/filesystem:collect-sub*directories. It takes as arguments:
- a directory
- a collectp function
- a recursep function
- a collector function
Given a directory, when collectp returns true with the directory, call the collector function on the directory, and recurse into each of its subdirectories on which recursep returns true.
This function will thus let you traverse a filesystem hierarchy, superseding the functionality of cl-fad:walk-directory.
The behavior in presence of symlinks is not portable. Use IOlib to handle such situations.
Examples:
- this collects only subdirectories:
(defparameter *dirs* nil "All recursive directories.")
(uiop:collect-sub*directories "~/cl-cookbook"
(constantly t)
(constantly t)
(lambda (it) (push it *dirs*)))
- this collects files and subdirectories:
(let ((results))
(uiop:collect-sub*directories
"./"
(constantly t)
(constantly t)
(lambda (subdir)
(setf results
(nconc results
;; A detail: we return strings, not pathnames.
(loop for path in (append (uiop:subdirectories subdir)
(uiop:directory-files subdir))
collect (namestring path))))))
results)
- we can do the same with the cl-fad library:
(cl-fad:walk-directory "./"
(lambda (name)
(format t "~A~%" name))
:directories t)
- and of course, we can use an external tool: the good ol' unix find, or the newer fd (fdfind on Debian) that has a simpler syntax and filters out a set of common files and directories by default (node_modules, .git…):
(str:lines (uiop:run-program (list "find" ".") :output :string))
;; or
(str:lines (uiop:run-program (list "fdfind") :output :string))
Here with the help of the str library.
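As a further sketch built on uiop:collect-sub*directories, the function below gathers, recursively, the files that have a given pathname type. The name find-files-by-type is invented for this example, and the remark about symlinks made earlier applies here too.

```lisp
(require "asdf") ; UIOP ships with ASDF

(defun find-files-by-type (root type)
  "Return the files under ROOT, searched recursively, whose pathname type
is TYPE (for example \"txt\"). Hypothetical helper for illustration."
  (let ((results '()))
    (uiop:collect-sub*directories
     (uiop:ensure-directory-pathname root)
     (constantly t)                     ; visit every directory
     (constantly t)                     ; recurse into every subdirectory
     (lambda (dir)
       (dolist (file (uiop:directory-files dir))
         (when (equal (pathname-type file) type)
           (push file results)))))
    (nreverse results)))
```

Replacing the second (constantly t) with a predicate that rejects, say, a .git directory is a simple way to prune the traversal.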
Below we simply list files of a directory and check that their name contains a given string.
(remove-if-not (lambda (it)
(search "App" (namestring it)))
(uiop:directory-files "./"))
(#P"/home/vince/projects/cl-cookbook/AppendixA.jpg"
#P"/home/vince/projects/cl-cookbook/AppendixB.jpg"
#P"/home/vince/projects/cl-cookbook/AppendixC.jpg")
We used namestring to convert a pathname to a string, thus a sequence that search can deal with.
We cannot transpose unix wildcards to portable Common Lisp.
In pathname strings we can use * and ** as wildcards. This works in absolute and relative pathnames.
(directory #P"*.jpg")
(directory #P"**/*.png")
The concept of . denoting the current directory does not exist in portable Common Lisp. This may exist in specific filesystems and specific implementations.
Also ~ to denote the home directory does not exist. They may be recognized by some implementations as non-portable extensions.
*default-pathname-defaults* provides a default for some pathname operations.
(let ((*default-pathname-defaults* (pathname "/bin/")))
(directory "*sh"))
(#P"/bin/zsh" #P"/bin/tcsh" #P"/bin/sh" #P"/bin/ksh" #P"/bin/csh" #P"/bin/bash")
See also (user-homedir-pathname).