-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathTODO
202 lines (177 loc) · 10.3 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
Usability:
* change bootstrap mechanism: generate pstate image instead of using embedbins
- removes uinit from kernel
- use same process to install HiStar onto a disk
- demand-page initial binaries from install CD/USB drive
* clean up i386: gcc -m32 by default
* performance: shared library prelinking
* standardize on a reasonable FS layout
* linux compat mode: run vmlinux instead of our unix emulation library
- catch pagefault, syscall traps, vector them to the vmlinux idt
- for sharing, implement "histarfs" in linux that uses our existing
for a labeled file system.
* multi-referenced containers
- what happens to parent pointers? disable for multi-refable containers..
- depth count, with inf value? not so composable, but decent.
* bootstrapping
- disk device object:
+ disk_create(int index)
+ disk_identify(struct disk_id) -> bus, device#, manuf, serial, size
+ disk_{read,write}_buf() -> read or write sector buffer for disk device
+ disk_cmd(uint64_t offset) -> transfer data between disk and sector buffer
+ disk_status() -> command active/completed
- pstate=bootstrap enables disk devices, does not write pstate to disk.
+ initialize partition table, write a kernel-only boot file to disk
- sys_pstate_enable() disables disk device access, writes pstate to disk.
+ writes the user-level persistent state to disk for later use.
* dstar-style labels: split information flow label from privilege label
(stars and clearance). optimize labels as hashed sets of 64-bit values.
* privilege objects: effectively a kobj_label, with one information flow
label protecting the privilege, and a privilege label storing privilege.
extend cobj_ref to hold a container entry of the privilege label to use,
or 0 for a thread's local privilege.
* process/page table objects: share page table based on a common privilege
object. need to also agree on the information flow label used to fill
in the page table. page table has container entry of address space object,
which also specifies the privilege object used to fill in the page table.
* optimize container entry lookups by hashing containers?
* multi-core support: scheduling, concurrency control at object level in
kernel, locality mechanisms
HW support:
* AHCI performance problems
* e1000 autoneg -- 10mbps works, 100mbps does not?
Compatibility issues:
* implement libc-tsd for correct multi-threaded libc-rpc
* MT-safe c++ support (lib/cppsup/bits/gthr.h)
Performance:
* deal with fragmentation on-disk
* partial segment swapin and swapout
* page reclamation, LRU (unpin callbacks for reclaiming space?)
* update lwip (http://pdos.csail.mit.edu/~sbw/lwip/)
Kernel bugs:
* reuse flushed but not committed blocks in the on-disk log for the
same disk address when writing pages to the log; otherwise a page
can be written to the log multiple times in a single transaction,
running the system out of log space. need to track commits in
the log code to do this (currently not tracked).
* bugs in handling out-of-memory conditions [low priority for now]:
- freelist/dstack: free_later list using a btree
- implement log.disk_map using a btree
- larger FRM, stick FRM contents in a separate disk block
* do something about uint128_t on i386 -- it's not there natively
* kobject_get_page could fail when out of memory with copy-on-write;
get rid of assertions that it succeeds (e.g. saving FP state -- get
the FP page before running the kernel thread, if not enough memory
then suspend the thread until it can allocate memory.)
* do the right things happen with th_as caching? thread lowers label?
* page_info::pi_parent is no longer necessary; use pi_plist instead
Security:
* netd_server doesn't do any checking on who's asking for what fd.
socket() should grant a handle at * to the caller that will let
them later do other socket operations.
* taint_ct in gate_call/gate_return should be labeled with gate-call handles.
* library call to run with a subset of your privileges, for doing some
operation on behalf of the caller (e.g. in the declassifier gate).
* should be more careful in dealing with segments passed in via gates; namely,
need to avoid page faults, and ensure that we access them only with caller's
privileges (verify label). can arise for two reasons: first, we map the
segment into our AS but don't have the privilege to read or write to it.
this can be fixed by segment_map() checking the label. second, the other
side can drop the segment and the service (other side of gate) will fault
because the segment is gone. for this, need either to copy all segments
first, or to have a careful-read and careful-write wrapper that does
setjmp() first, and the segfault handler will longjmp().
* force checkpoint every N handle allocations, and roll forward handle counter by
N after restore, for some large value N.
* real randomness...
* system calls should be atomic (all-or-nothing), otherwise we have a covert
storage channel when you can tell a system call executed partway and suspended
when it hit something. the culprits are variable-length data structures -- AS
and labels -- sys_as_get, sys_as_set, and anything that calls label_to_ulabel:
sys_obj_get_label, sys_self_get_{clearance,verify}, sys_gate_clearance.
* probably should scrub state more often across gate calls: gate_call_data
passed to gate_call::call() and gatesrv_return::ret(), and TLS stack (mask
utraps before doing this, to avoid race conditions). look at lib/authclnt.cc.
* what happens when gate caller doesn't grant all of the gate's own privileges?
maybe change sys_gate_enter: caller has to grant all stars that the gate has?
a lot of code already assumes this, and would have to check whether we lost
any stars on every gate entry in user-space.. is it expensive to do this in
user-space?
* you can provide an arbitrary gate for a gate-based service to return to, and
gatesrv does no checks to make sure the caller could have really invoked it
in the first place. should fetch the return gate's verify label VL_RG and
check that VL_T \leq VL_RG.
* threads can detect reboot using SIDT, SGDT, SLDT, STR?
place any such global structs at deterministic memory locations in kernel.
* can use CPUID, %eflags[ID] to detect hardware changes
* should disable RDTSC from user-level, except we use it for profiling now..
what to do about high-precision time in general?
* extend the scheduling quantum when we take a hardware interrupt? to allow
ssl_eprocd to finish its RSA operation without preemption.
* does rflags.RF allow user-space to detect kernel faults? this bit is
always cleared by IRET.
Cleanup:
* better separation between kernel, user-space and shared header files
* user-space resource allocators -- i.e. running without CT_QUOTA_INF
* (?) expose kobj_label's at the syscall API to get label sharing
* lwip: fix sys_arch_timeouts() in lib/lwip/jos64/arch/sys_arch.c;
need a callback for when threads enter/leave an AS.
* sys_container_get_parent(): check thread can read parent (for later chown)
* potentially unify spawn & gate invocation. spawn could be split into two
parts: (1) create new address space and a gate vectoring into _start, and
(2) invoke said gate with whatever privileges, arguments, etc that we want.
* provide resource-clean abstractions for starting a thread on TLS,
and jumping through a gate without leaving any resources behind.
* enable __UCLIBC_DYNAMIC_ATEXIT__; the problem with it is that global objects
get registered from _init, at which point there isn't a start_env, which
makes dynamic allocation difficult.
Bugs:
* netd bug (fetch'ing files hangs for some files but not others)
* look for concurrency bugs using test harness (make disk ops async)
* netd connect() does not honor O_NONBLOCK
Larger issues:
* system management: kernel upgrades? system install? partitioning a disk?
* backup/restore
* system management in a fully-persistent world:
- restarting services with handles and gates
- upgrades (signing binaries => integrity handles?)
* expose CPU resource allocation: track resources in container entries and
re-compute a thread's th_sched_tickets when it changes its label to only
include resources from containers it can read (set_sched_parents).
associate scheduling type (time-based, instruction-based) and cache
clearing flag with container.
* Linux in histar userspace to replace LWIP with Linux's TCP stack.
* port SANE switches to HiStar (Ed)
* "decentralized SANE" to reduce covert channels in distributed HiStar
* trusted path for voting machines? port python to HiStar for voting software.
New hardware support:
* hyperthreading/multicore: use monitor/mwait or pause/spinwait for
fast sys-/vmm-calls.
* debug registers: use call tracing for kernel debugging?
intel CPUs have 16-slot buffers it seems, while AMD traps on every jmp.
* iommu: device drivers into user-mode. some more thoughts on hardcopy paper.
not so easy due to overly-trusting PCI-PCI bridges. it may be OK to have
one "device driver" protection domain for most PCI devices, except that
the disk is a fully-trusted device. if we can ensure that bad PCI cards
cannot take over the disk, then we may be OK. also handle NMI.
* cpu virt: containerize SMM (do any desktops use SMM much? usb kbd virt?)
* cpu virt: move parts of the current CPL=0 kernel into a VMM ("ring -1")
and make the CPL=0 code less trusted.
* cpu virt: with nested page tables, implement a "vmrun" system call that
will allow us to run a VM in a HiStar thread. for devices, use qemu
which was ported to do the same thing for Linux's KVM.
* use ata "nv cache" (flash memory) feature set to speed up disk sync.
Miscellaneous:
===
multiple threads that have different labels in the same AS is a little
strange. when you want to give an fd to another thread, you have to
grant it fd handles at *; similarly, when you close an fd, you should
make sure no other thread has those handles at * (otherwise they'll
never get GCed from the label). maybe it makes sense to have some
sort of label shared by threads that goes along with an AS?
for discretionary handles (at *) we could implement this at user-space,
by using lib/privstore.cc to stash away/drop privilege. for handles
at non-* levels, this might be a non-issue?
possible answer: store all stars in a privstore; stars in thread label
are a cache. how to catch -E_LABEL errors and how to fetch the right
handle?