Specification » History » Version 5
Alexander Blum, 03/08/2023 03:03 AM
1 | 1 | Alexander Blum | {{toc}} |
---|---|---|---|
2 | 1 | Alexander Blum | |
3 | 1 | Alexander Blum | # General |
4 | 1 | Alexander Blum | |
5 | 1 | Alexander Blum | ## Object Processing |
6 | 1 | Alexander Blum | |
7 | 1 | Alexander Blum | ### Commits |
8 | 1 | Alexander Blum | |
9 | 1 | Alexander Blum | * Web users provide the system with metadata on objects as the information basis for the distribution of money |
10 | 1 | Alexander Blum | * The system needs to balance web users' need to change the metadata against the stability of the information basis |
11 | 1 | Alexander Blum | * Artists, Releases, Creations have to be committed by the web users as a conscious act |
12 | 1 | Alexander Blum | * No data is published before a commit |
13 | 1 | Alexander Blum | * Web users are able to distinguish public/private fields after a commit |
14 | 1 | Alexander Blum | * A commit of a set of objects (all fields, public fields) is saved as rendered text for evidence |
15 | 1 | Alexander Blum | * Creations do not need to have content to be committable |
16 | 1 | Alexander Blum | * Content with a relation to a Creation is autocommitted, when the creation is committed |
17 | 1 | Alexander Blum | * Before a commit, web users may edit the object freely |
18 | 1 | Alexander Blum | * Committing an object implies committing all objects down the hierarchy (see [[Workflows#Licenser-Cascades|Cascades]]) |
19 | 1 | Alexander Blum | * After a commit, web users may |
20 | 1 | Alexander Blum | * edit only data not relevant for the distribution or subject to frequent changes: |
21 | 1 | Alexander Blum | * adding/removing members of an Artist |
22 | 1 | Alexander Blum | * adding a Release to a Creation |
23 | 1 | Alexander Blum | * changing Release metadata |
24 | 1 | Alexander Blum | * trigger a dispute request for changing relevant data: |
25 | 1 | Alexander Blum | * deleting an object (implies creation of duplicate for references) |
26 | 1 | Alexander Blum | * adding/removing a contributor to a Creation |
27 | 1 | Alexander Blum | * adding/removing an original/derivative Creation |
28 | 1 | Alexander Blum | * adding/removing licenses to Releases of Creations |
29 | 1 | Alexander Blum | * adding/removing the Content assigned to a Creation |
30 | 1 | Alexander Blum | * removing a Release from a Creation |
31 | 1 | Alexander Blum | * After a commit, administrators may |
32 | 1 | Alexander Blum | * revise the object(s) |
33 | 1 | Alexander Blum | * reject the object(s) |
34 | 1 | Alexander Blum | * A successful claim dispute (see [[Workflows#Commit]]) causes the object to be uncommitted |
35 | 1 | Alexander Blum | * Uncommitted objects should be recommitted as soon as possible by the web user |
36 | 1 | Alexander Blum | * if the artist name or creation title is changed on recommit, all admins of referencing objects are informed via email/web interface |
37 | 1 | Alexander Blum | * if a referenced object is deleted before recommit, a duplicate for all references is created and referenced instead |
38 | 1 | Alexander Blum | * Created / Uncommitted objects |
39 | 1 | Alexander Blum | * are promoted on the web user dashboard |
40 | 1 | Alexander Blum | * are excluded from searches (via add/list/etc), except |
41 | 1 | Alexander Blum | * web users can add own uncommitted objects to other own uncommitted objects |
42 | 1 | Alexander Blum | * web users can see but not add own uncommitted objects to other committed objects |
43 | 1 | Alexander Blum | |
44 | 1 | Alexander Blum | ### Disputes |
45 | 1 | Alexander Blum | |
46 | 1 | Alexander Blum | * A dispute is the process of mediation in a conflict between web users |
47 | 1 | Alexander Blum | * A dispute has a |
48 | 1 | Alexander Blum | * code: Sequence |
49 | 1 | Alexander Blum | * state: Selection (requested, assigned, resolved) |
50 | 1 | Alexander Blum | * case: Selection (list of usecases?) |
51 | 1 | Alexander Blum | * object: Reference (to the disputed objects) |
52 | 1 | Alexander Blum | * assignee: User |
53 | 1 | Alexander Blum | * request_party: Party |
54 | 1 | Alexander Blum | * request_text: Text (user statement) |
55 | 1 | Alexander Blum | * request_time: DateTime |
56 | 1 | Alexander Blum | * resolved_time: DateTime |
57 | 1 | Alexander Blum | * comments: Comment (many comments by administrators) |
58 | 1 | Alexander Blum | * A dispute comment has a |
59 | 1 | Alexander Blum | * dispute: Dispute |
60 | 1 | Alexander Blum | * text: Text |
61 | 1 | Alexander Blum | * time: DateTime |
62 | 1 | Alexander Blum | * A dispute may be triggered by a web user on several occasions |
63 | 1 | Alexander Blum | * The web user [[Specification#Claims|claims]] ownership of an already claimed object |
64 | 1 | Alexander Blum | * The web user claims authorship of a Content marked as a duplicate |
65 | 1 | Alexander Blum | * The web user requests a change/deletion of [[Specification#Commits|committed]] content |
66 | 1 | Alexander Blum | * A dispute can be requested, assigned and resolved (see [[Workflows#Dispute]]) |
67 | 1 | Alexander Blum | * A dispute is requested via a dispute form including |
68 | 1 | Alexander Blum | * The disputed object |
69 | 1 | Alexander Blum | * The issue category |
70 | 1 | Alexander Blum | * The user motivation |
71 | 1 | Alexander Blum | * A dispute is handled by an assignee |
72 | 1 | Alexander Blum | * The assignee can add many comments |
73 | 1 | Alexander Blum | * The assignee can mark the dispute as resolved |
74 | 1 | Alexander Blum | |
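The dispute fields and the requested/assigned/resolved life cycle listed above can be sketched as a small data model. This is a minimal illustration, not the actual implementation: the strictly linear state order, the method names `assign`, `add_comment` and `resolve`, the use of plain strings for users, parties and object references, and the name `disputed_object` (for the field `object`) are all assumptions. The back-reference from a comment to its dispute is replaced by the `comments` list.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class DisputeComment:
    text: str
    time: datetime


@dataclass
class Dispute:
    code: str
    case: str
    disputed_object: str  # spec field "object"; a plain identifier here (assumption)
    request_party: str
    request_text: str
    request_time: datetime
    state: str = "requested"  # one of: requested, assigned, resolved
    assignee: Optional[str] = None
    resolved_time: Optional[datetime] = None
    comments: List[DisputeComment] = field(default_factory=list)

    def assign(self, user: str) -> None:
        # Assumption: only a requested dispute can be assigned.
        if self.state != "requested":
            raise ValueError("only a requested dispute can be assigned")
        self.assignee = user
        self.state = "assigned"

    def add_comment(self, text: str, time: datetime) -> None:
        # The assignee can add many comments.
        if self.state != "assigned":
            raise ValueError("comments require an assigned dispute")
        self.comments.append(DisputeComment(text=text, time=time))

    def resolve(self, time: datetime) -> None:
        # The assignee marks the dispute as resolved.
        if self.state != "assigned":
            raise ValueError("only an assigned dispute can be resolved")
        self.resolved_time = time
        self.state = "resolved"
```

Keeping the transitions in methods means an invalid order (for example resolving a dispute that was never assigned) fails loudly instead of silently producing an inconsistent record.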
75 | 1 | Alexander Blum | # Licensing |
76 | 4 | Alexander Blum | |
77 | 4 | Alexander Blum | ## Tariff System |
78 | 4 | Alexander Blum | |
79 | 4 | Alexander Blum | * A creation can have several tariff categories represented by different collecting societies |
80 | 1 | Alexander Blum | |
81 | 1 | Alexander Blum | ## API |
82 | 1 | Alexander Blum | |
83 | 5 | Alexander Blum | * TODO |
84 | 5 | Alexander Blum | |
85 | 5 | Alexander Blum | ## Collection |
86 | 5 | Alexander Blum | |
87 | 5 | Alexander Blum | ## Allocation |
88 | 5 | Alexander Blum | * Invoice |
89 | 5 | Alexander Blum | |
90 | 5 | Alexander Blum | ## Declaration |
91 | 5 | Alexander Blum | |
92 | 5 | Alexander Blum | ## Utilisation |
93 | 5 | Alexander Blum | * Tariff |
94 | 5 | Alexander Blum | * UtilisationCreationList ([[Usecases#Examples]]) |
95 | 5 | Alexander Blum | * UtilisationIndicator (Base, Relevance, Adjustments) |
96 | 5 | Alexander Blum | * ContextIndicator |
97 | 5 | Alexander Blum | * ... |
98 | 5 | Alexander Blum | |
99 | 5 | Alexander Blum | ## Indicators |
100 | 5 | Alexander Blum | |
101 | 5 | Alexander Blum | * UtilisationIndicator |
102 | 5 | Alexander Blum | * ContextIndicators (for each tariff) |
103 | 5 | Alexander Blum | |
104 | 1 | Alexander Blum | # Repertoire |
105 | 1 | Alexander Blum | |
106 | 2 | Alexander Blum | ## Rights Management |
107 | 1 | Alexander Blum | |
108 | 3 | Alexander Blum | * A rightsholder |
109 | 2 | Alexander Blum | * must hold a right |
110 | 2 | Alexander Blum | * Copyright |
111 | 2 | Alexander Blum | * Ancillary Copyright |
112 | 2 | Alexander Blum | * must have an object to which the right belongs |
113 | 2 | Alexander Blum | * Creation |
114 | 2 | Alexander Blum | * Release |
115 | 2 | Alexander Blum | * must have a contribution depending on the object and right |
116 | 2 | Alexander Blum | * Creation |
117 | 2 | Alexander Blum | * Copyright |
118 | 2 | Alexander Blum | * Lyrics |
119 | 2 | Alexander Blum | * Composition |
120 | 2 | Alexander Blum | * Ancillary Copyright: |
121 | 2 | Alexander Blum | * Instrument |
122 | 2 | Alexander Blum | * Production |
123 | 2 | Alexander Blum | * Mixing |
124 | 2 | Alexander Blum | * Mastering |
125 | 2 | Alexander Blum | * Release |
126 | 2 | Alexander Blum | * Copyright |
127 | 2 | Alexander Blum | * Artwork |
128 | 2 | Alexander Blum | * Text |
129 | 2 | Alexander Blum | * Layout |
130 | 2 | Alexander Blum | * Ancillary Copyright |
131 | 2 | Alexander Blum | * Production |
132 | 2 | Alexander Blum | * Mixing |
133 | 2 | Alexander Blum | * Mastering |
134 | 2 | Alexander Blum | * may have a start and end date |
135 | 2 | Alexander Blum | * may be restricted to a territory |
136 | 2 | Alexander Blum | * may have a successor |
137 | 2 | Alexander Blum | * may be represented by a collecting society |
138 | 1 | Alexander Blum | * may have a list of instruments for Creation -> Ancillary Copyright -> Instrument |
139 | 3 | Alexander Blum | * Rightsholder subjects for creations and releases are Artists |
140 | 1 | Alexander Blum | |
141 | 1 | Alexander Blum | ## Object processing |
142 | 1 | Alexander Blum | |
143 | 1 | Alexander Blum | ### Claims |
144 | 1 | Alexander Blum | |
145 | 1 | Alexander Blum | * Artists, Releases, and Creations are unclaimed, offered or claimed |
146 | 1 | Alexander Blum | * Unclaimed objects are objects that do not belong to a web user |
147 | 1 | Alexander Blum | * Offered objects are objects that might belong to a web user and are offered to be claimed by the web user (e.g. on registration) |
148 | 1 | Alexander Blum | * Claimed objects are objects that "belong" to at least one web user |
149 | 1 | Alexander Blum | * Web users can (see [[Workflows#Claim]]) claim |
150 | 1 | Alexander Blum | * unclaimed and offered objects in general |
151 | 1 | Alexander Blum | * a solo artist for the current web user |
152 | 1 | Alexander Blum | * a group artist for a solo artist |
153 | 1 | Alexander Blum | * a compilation release for a solo artist (role: producer) |
154 | 1 | Alexander Blum | * a split/artist release for an artist |
155 | 1 | Alexander Blum | * a creation for an artist |
156 | 1 | Alexander Blum | * A claim implies a request for admin rights, where applicable |
157 | 1 | Alexander Blum | * Revised objects are visually promoted on the object list/details |
158 | 1 | Alexander Blum | * Unclaiming/Claiming/Revising an object implies unclaiming/claiming/revising all objects down the hierarchy (see [[Workflows#Creative-Cascades|Cascades]]) |
159 | 1 | Alexander Blum | * Claiming an unclaimed object results in an uncommitted object |
160 | 1 | Alexander Blum | * Claiming a claimed object |
161 | 1 | Alexander Blum | * results in a [[Specification#Disputes|dispute]] |
162 | 1 | Alexander Blum | * may result in an uncommitted object, if the disputing party proves to be right |
163 | 1 | Alexander Blum | |
164 | 1 | Alexander Blum | ### Foreign objects |
165 | 1 | Alexander Blum | |
166 | 1 | Alexander Blum | * During object creation, a web user may create several new, [[Specification#Claims|unclaimed]] foreign objects |
167 | 1 | Alexander Blum | * When a web user creates a new Artist, they may create many member Artists specified by a name and an email |
168 | 1 | Alexander Blum | * When a web user creates a new Creation, they may create |
169 | 1 | Alexander Blum | * many contributor Artists specified by group (yes/no), a name and an email |
170 | 1 | Alexander Blum | * many original/derivative Creations specified by the creation and the artist name, resulting in |
171 | 1 | Alexander Blum | * a new/referenced Artist object: artist name |
172 | 1 | Alexander Blum | * a new Creation object: Artist |
173 | 1 | Alexander Blum | * many track Creations specified by the creation and the artist name, resulting in |
174 | 1 | Alexander Blum | * a new/referenced Artist object: artist name |
175 | 1 | Alexander Blum | * a new Creation object: Artist |
176 | 1 | Alexander Blum | * Foreign objects are auto-[[Specification#Commits|committed]] |
177 | 1 | Alexander Blum | * Foreign objects may be added by others |
178 | 1 | Alexander Blum | * resulting in a duplicate for information separation |
179 | 1 | Alexander Blum | * referencing the duplicated foreign object for deduplication |
180 | 1 | Alexander Blum | * Foreign objects are editable |
181 | 1 | Alexander Blum | * by the web user who created the foreign object, if |
182 | 1 | Alexander Blum | * the object was created by the web user |
183 | 1 | Alexander Blum | * the object is not [[Specification#Claims|claimed]] |
184 | 1 | Alexander Blum | * the object has not yet been part of a distribution |
185 | 1 | Alexander Blum | * by the object admin of the object for which the foreign object was created |
186 | 1 | Alexander Blum | |
187 | 1 | Alexander Blum | ## File Processing |
188 | 1 | Alexander Blum | |
189 | 1 | Alexander Blum | ### Intermediate storage for archive |
190 | 1 | Alexander Blum | |
191 | 1 | Alexander Blum | * Files are stored on an intermediate file server |
192 | 1 | Alexander Blum | * For every stage of a file there is a corresponding folder `STAGE` |
193 | 1 | Alexander Blum | * For each change of processing state, all files are moved into the next `STAGE` folder |
194 | 1 | Alexander Blum | * For each folder `STAGE` there is a folder per `USER` |
195 | 1 | Alexander Blum | * Folder structure for optimized filesystem access |
196 | 1 | Alexander Blum | * Semantic information for manual administrative interventions |
197 | 1 | Alexander Blum | * The filenames of the files are |
198 | 1 | Alexander Blum | * the `HASH` of the filename for temporary uploads |
199 | 1 | Alexander Blum | * the `UUID` of the content in the database for all other stages |
200 | 1 | Alexander Blum | * For each file `[UUID|HASH]` there is a file `[UUID|HASH].checksums` |
201 | 1 | Alexander Blum | * Content: Checksums for each single upload chunk |
202 | 1 | Alexander Blum | * Format: CSV (begin, end, algorithm, checksum) |
203 | 1 | Alexander Blum | * For each file `[UUID|HASH]` there is a file `UUID.checksum` |
204 | 1 | Alexander Blum | * Content: Checksum of the whole file |
205 | 1 | Alexander Blum | * Format: Plain text |
206 | 1 | Alexander Blum | * The full syntax for a file is |
207 | 1 | Alexander Blum | * `./STAGE/USER/UUID(.checksum(s))` |
208 | 1 | Alexander Blum | * `./temporary/USER/HASH(.checksum(s))` |
209 | 1 | Alexander Blum | * Examples of file paths |
210 | 1 | Alexander Blum | * `./temporary/4/82d5582443e9f8d35d3ec798662255e46e9e8138c290b626a74a3bee9382d430` |
211 | 1 | Alexander Blum | * `./temporary/4/82d5582443e9f8d35d3ec798662255e46e9e8138c290b626a74a3bee9382d430.checksums` |
212 | 1 | Alexander Blum | * `./uploaded/4/35f0c169-6594-4bf3-b285-451d2aa8c61e` |
213 | 1 | Alexander Blum | * `./uploaded/4/35f0c169-6594-4bf3-b285-451d2aa8c61e.checksums` |
214 | 1 | Alexander Blum | * `./previewed/4/35f0c169-6594-4bf3-b285-451d2aa8c61e` |
215 | 1 | Alexander Blum | * `./checksummed/4/35f0c169-6594-4bf3-b285-451d2aa8c61e` |
216 | 1 | Alexander Blum | * `./checksummed/4/35f0c169-6594-4bf3-b285-451d2aa8c61e.checksum` |
217 | 1 | Alexander Blum | * `./checksummed/4/35f0c169-6594-4bf3-b285-451d2aa8c61e.checksums` |
218 | 1 | Alexander Blum | * `./fingerprinted/4/35f0c169-6594-4bf3-b285-451d2aa8c61e` |
219 | 1 | Alexander Blum | * `./dropped/4/35f0c169-6594-4bf3-b285-451d2aa8c61e` |
220 | 1 | Alexander Blum | * `./rejected/4/35f0c169-6594-4bf3-b285-451d2aa8c61e` |
221 | 1 | Alexander Blum | |
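The path syntax above can be sketched as a small helper. This is only an illustration: the function name `stage_path` is hypothetical, and the set of stage names is taken from the example paths in this section, so it may be incomplete.

```python
# Stage folder names as they appear in the example paths (possibly incomplete).
STAGES = {
    "temporary", "uploaded", "previewed", "checksummed",
    "fingerprinted", "dropped", "rejected",
}
# "" is the file itself, ".checksums" the per-chunk CSV,
# ".checksum" the plain-text checksum of the whole file.
SUFFIXES = {"", ".checksum", ".checksums"}


def stage_path(stage, user, name, suffix=""):
    """Build ./STAGE/USER/NAME(.checksum(s)); NAME is a HASH or a UUID."""
    if stage not in STAGES:
        raise ValueError("unknown stage: %r" % (stage,))
    if suffix not in SUFFIXES:
        raise ValueError("unknown suffix: %r" % (suffix,))
    return "./%s/%s/%s%s" % (stage, user, name, suffix)
```

For example, `stage_path("uploaded", 4, "35f0c169-6594-4bf3-b285-451d2aa8c61e", ".checksums")` yields the fourth example path listed above.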
222 | 1 | Alexander Blum | |
223 | 1 | Alexander Blum | ### Permanent storage for user content |
224 | 1 | Alexander Blum | |
225 | 1 | Alexander Blum | * User content is stored on a permanent file server |
226 | 1 | Alexander Blum | * For every content type of a file there is a corresponding folder `CONTENTTYPE` (e.g. 'previews') |
227 | 1 | Alexander Blum | * For each folder `CONTENTTYPE` there is a folder per `USER` |
228 | 1 | Alexander Blum | * Folder structure for optimized filesystem access |
229 | 1 | Alexander Blum | * Semantic information for manual administrative interventions |
230 | 1 | Alexander Blum | * The filenames of the files are |
231 | 1 | Alexander Blum | * the `UUID` of the content in the database |
232 | 1 | Alexander Blum | * The full syntax for a file is |
233 | 1 | Alexander Blum | * `./CONTENTTYPE/USER/UUID` |
234 | 1 | Alexander Blum | * Examples of file paths |
235 | 1 | Alexander Blum | * `./previews/4/35f0c169-6594-4bf3-b285-451d2aa8c61e` |
236 | 1 | Alexander Blum | * `./excerpts/4/35f0c169-6594-4bf3-b285-451d2aa8c61e` |
237 | 1 | Alexander Blum | |
238 | 1 | Alexander Blum | ### Stages |
239 | 1 | Alexander Blum | |
240 | 1 | Alexander Blum | * For an overview of the workflow, see [[Workflows#File-Processing|Workflow]] |
241 | 1 | Alexander Blum | * Each stage of file processing results in a corresponding processing state |
242 | 1 | Alexander Blum | 1. Upload *(uploaded)* |
243 | 1 | Alexander Blum | 2. Process |
244 | 1 | Alexander Blum | * Preview *(previewed)* |
245 | 1 | Alexander Blum | * Checksum *(checksummed)* |
246 | 1 | Alexander Blum | * Fingerprint *(fingerprinted)* |
247 | 1 | Alexander Blum | 5. Drop *(dropped)* |
248 | 1 | Alexander Blum | 6. Archive *(archived)* |
249 | 1 | Alexander Blum | * There is a special state for user requested deletions *(tobedeleted)* |
250 | 1 | Alexander Blum | * Before a file is processed at any stage, a file `UUID.lock` is created to signal other processes to skip the file. The lock file is deleted after the file `UUID` has been moved to the next stage folder. |
251 | 1 | Alexander Blum | |
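The lock file convention described above could be implemented roughly as follows. This is a sketch under assumptions: the function names `try_lock` and `unlock` are hypothetical, and the use of exclusive file creation (`O_CREAT | O_EXCL`) to avoid a race between two workers is a design choice that the specification does not prescribe.

```python
import os


def try_lock(path):
    """Create `path + '.lock'` atomically; return True if this process got the lock."""
    try:
        # O_EXCL makes the call fail if the lock file already exists,
        # so only one process can create it.
        fd = os.open(path + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another process holds the lock: skip this file
    os.close(fd)
    return True


def unlock(path):
    """Remove the lock file after the file has been moved to the next stage folder."""
    os.remove(path + ".lock")
```

A worker would call `try_lock` before touching a file, skip the file when it returns `False`, and call `unlock` once the move to the next `STAGE` folder is done.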
252 | 1 | Alexander Blum | #### Upload a file |
253 | 1 | Alexander Blum | |
254 | 1 | Alexander Blum | * Users are only allowed to upload content that belongs to them |
255 | 1 | Alexander Blum | * Upload of a file in chunks |
256 | 1 | Alexander Blum | * The chunk size is 1MiB (1024*1024 Byte) |
257 | 1 | Alexander Blum | * The chunk position is given by the header `Content-Range` (chunk start, chunk end, total size) |
258 | 1 | Alexander Blum | * A HASH of the user given filename is used as temporary filename |
259 | 1 | Alexander Blum | * The file is stored in `./storage/temporary/USER/HASH` |
260 | 1 | Alexander Blum | * For each uploaded chunk |
261 | 1 | Alexander Blum | * A checksum of the chunk is calculated, while the chunk is still in RAM |
262 | 1 | Alexander Blum | * The checksum is appended to `./storage/temporary/USER/HASH.checksums` (CSV: begin, end, algorithm, checksum) |
263 | 1 | Alexander Blum | * The database is queried for a duplicate of the checksum for early abuse detection |
264 | 1 | Alexander Blum | * The chunk checksum collisions are tracked in the user session |
265 | 1 | Alexander Blum | * Certain checksums are whitelisted (e.g. silence in different formats with/without headers) |
266 | 1 | Alexander Blum | * Above a configurable threshold (e.g. 150), the user upload is restricted temporarily |
267 | 1 | Alexander Blum | * The threshold violations are tracked in the database |
268 | 1 | Alexander Blum | * The chunk is appended to `./storage/temporary/USER/HASH` |
269 | 1 | Alexander Blum | * When the upload is finished |
270 | 1 | Alexander Blum | * A UUID is generated |
271 | 1 | Alexander Blum | * If the validation of the file extension or MIME type fails, further processing is aborted |
272 | 1 | Alexander Blum | * The files are moved to `./storage/rejected/USER/UUID(.checksums)` |
273 | 1 | Alexander Blum | * The Content is saved to the database (processing state: rejected, rejection reason: format_error, path) |
274 | 1 | Alexander Blum | * The files are moved to `./storage/uploaded/USER/UUID(.checksums)` |
275 | 1 | Alexander Blum | * The Content is saved to the database (processing state: uploaded, path, storage_hostname) |
276 | 1 | Alexander Blum | * The Checksums are saved to the database |
277 | 1 | Alexander Blum | |
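The per-chunk handling above (read the position from `Content-Range`, checksum the chunk while it is in RAM, append a CSV row) might look like the following sketch. The function names are hypothetical, SHA-256 is an assumed default algorithm, and the header is assumed to use the standard HTTP form `bytes BEGIN-END/TOTAL` with an inclusive end. The duplicate lookup in the database and the abuse threshold are left out.

```python
import csv
import hashlib
import io
import re

CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+)$")


def parse_content_range(header):
    """Return (begin, end, total) from a Content-Range header value."""
    match = CONTENT_RANGE.match(header)
    if match is None:
        raise ValueError("malformed Content-Range: %r" % (header,))
    begin, end, total = (int(group) for group in match.groups())
    if begin > end or end >= total:
        raise ValueError("inconsistent Content-Range: %r" % (header,))
    return begin, end, total


def checksum_row(chunk, begin, end, algorithm="sha256"):
    """Build one CSV row (begin, end, algorithm, checksum) for a chunk held in memory."""
    if len(chunk) != end - begin + 1:
        raise ValueError("chunk length does not match the announced range")
    return [begin, end, algorithm, hashlib.new(algorithm, chunk).hexdigest()]


def append_checksum(stream, row):
    """Append the row to an open `.checksums` file (or any text stream)."""
    csv.writer(stream).writerow(row)
```

In a request handler, the stream would be the `HASH.checksums` file opened in append mode; an `io.StringIO` works the same way for testing.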
278 | 1 | Alexander Blum | #### Create a preview |
279 | 1 | Alexander Blum | |
280 | 1 | Alexander Blum | * For each file in `./storage/uploaded` |
281 | 1 | Alexander Blum | * The file is locked during processing |
282 | 1 | Alexander Blum | * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted |
283 | 1 | Alexander Blum | * The files are moved to `./storage/unknown/(USER/)UUID(.checksums)` |
284 | 1 | Alexander Blum | * The Content is updated, if possible (processing state: unknown, path) |
285 | 1 | Alexander Blum | * An excerpt for analysis and statistics is taken and stored in `./content/excerpt/UUID[1]/UUID[2]/UUID` |
286 | 1 | Alexander Blum | * Length: 60 seconds out of the middle of the file |
287 | 1 | Alexander Blum | * Quality: Minimum for fingerprint recognition (11025 Hz, 16 bit, mono) |
288 | 1 | Alexander Blum | * A preview is created and stored in `./content/previews/UUID[1]/UUID[2]/UUID` |
289 | 1 | Alexander Blum | * Quality: Minimum for acceptable user experience (12bit, mono, 16kHz, ogg) |
290 | 1 | Alexander Blum | * Configuration: fade in, fade out, segment interval, segment length, segment crossfade |
291 | 1 | Alexander Blum | * If preview or excerpt creation fails, further processing is aborted |
292 | 1 | Alexander Blum | * The files are moved to `./storage/rejected/USER/UUID(.checksums)` |
293 | 1 | Alexander Blum | * The Content is rejected (rejection reason: format_error, path) |
294 | 1 | Alexander Blum | * The files are moved to `./storage/previewed/USER/UUID(.checksums)` |
295 | 1 | Alexander Blum | * The audio properties are saved to the database (length, channels, sample rate, sample width) |
296 | 1 | Alexander Blum | * The Content is updated (processing state: previewed, path) |
297 | 1 | Alexander Blum | |
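The rule "60 seconds out of the middle of the file" for the excerpt can be expressed as a small calculation. This is a sketch: the function name `excerpt_window` is hypothetical, and using the whole file when it is shorter than the excerpt length is an assumption the specification does not state.

```python
def excerpt_window(duration, length=60.0):
    """Return (start, end) in seconds of an excerpt centered in the file."""
    if duration <= length:
        # Assumption: short files are used in full.
        return 0.0, duration
    start = (duration - length) / 2
    return start, start + length
```

For a file of 200 seconds this gives the window from second 70 to second 130.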
298 | 1 | Alexander Blum | #### Calculate a checksum |
299 | 1 | Alexander Blum | |
300 | 1 | Alexander Blum | * For each file in `./storage/previewed` |
301 | 1 | Alexander Blum | * The file is locked during processing |
302 | 1 | Alexander Blum | * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted |
303 | 1 | Alexander Blum | * The files are moved to `./storage/unknown/(USER/)UUID(.checksums)` |
304 | 1 | Alexander Blum | * The Content is updated, if possible (processing state: unknown, path) |
305 | 1 | Alexander Blum | * Checksums for each chunk of 1MiB (1024*1024 Byte) are calculated and saved to the database, if not present |
306 | 1 | Alexander Blum | * A checksum for the whole file is calculated and stored in `./storage/previewed/USER/UUID.checksum` |
307 | 1 | Alexander Blum | * If the checksum is already present in the database, further processing is aborted |
308 | 1 | Alexander Blum | * The preview and excerpt are deleted |
309 | 1 | Alexander Blum | * The files are moved to `./storage/rejected/USER/UUID(.checksum(s))` |
310 | 1 | Alexander Blum | * The Content is rejected (rejection reason: checksum_collision, duplicate of: Content, path) |
311 | 1 | Alexander Blum | * The files are moved to `./storage/checksummed/USER/UUID(.checksum(s))` |
312 | 1 | Alexander Blum | * The Checksum is saved to the database |
313 | 1 | Alexander Blum | * The Content is updated (processing state: checksummed, path) |
314 | 1 | Alexander Blum | |
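The chunk checksums and the whole-file checksum described above can be computed in a single pass over the file. The sketch below assumes SHA-256 and inclusive begin/end byte offsets (matching the "first Byte"/"last Byte" wording of the Checksum object); the function name `file_checksums` is hypothetical, and the database lookups are omitted.

```python
import hashlib
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB, as in the specification


def file_checksums(stream, algorithm="sha256", chunk_size=CHUNK_SIZE):
    """Return (rows, whole) for a binary stream.

    rows  -- list of (begin, end, algorithm, checksum), one per chunk
    whole -- checksum of the complete file
    """
    whole = hashlib.new(algorithm)
    rows = []
    begin = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        whole.update(chunk)  # feed the same bytes into the whole-file digest
        end = begin + len(chunk) - 1
        rows.append((begin, end, algorithm, hashlib.new(algorithm, chunk).hexdigest()))
        begin = end + 1
    return rows, whole.hexdigest()
```

Reading each chunk once and feeding it to both digests avoids a second pass over large audio files.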
315 | 1 | Alexander Blum | #### Calculate a fingerprint |
316 | 1 | Alexander Blum | |
317 | 1 | Alexander Blum | * For each file in `./storage/checksummed` |
318 | 1 | Alexander Blum | * The file is locked during processing |
319 | 1 | Alexander Blum | * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted |
320 | 1 | Alexander Blum | * The files are moved to `./storage/unknown/(USER/)UUID(.checksums)` |
321 | 1 | Alexander Blum | * The Content is updated, if possible (processing state: unknown, path) |
322 | 1 | Alexander Blum | * The fingerprint is created |
323 | 1 | Alexander Blum | * If the fingerprint is already present in the database, further processing is aborted |
324 | 1 | Alexander Blum | * The preview and excerpt are deleted |
325 | 1 | Alexander Blum | * The files are moved to `./storage/rejected/USER/UUID(.checksum(s))` |
326 | 1 | Alexander Blum | * The Content is rejected (rejection reason: fingerprint_collision, duplicate of: Content, path) |
327 | 1 | Alexander Blum | * The fingerprint is ingested into the database (primary key: Content Uuid) |
328 | 1 | Alexander Blum | * A FingerprintLog is saved to the database (timestamp, user, algorithm, version) |
329 | 1 | Alexander Blum | * The files are moved to `./storage/fingerprinted/USER/UUID(.checksum(s))` |
330 | 1 | Alexander Blum | * The Content is updated (processing state: fingerprinted, path) |
331 | 1 | Alexander Blum | |
332 | 1 | Alexander Blum | #### Drop a file |
333 | 1 | Alexander Blum | |
334 | 1 | Alexander Blum | * For each file in `./storage/fingerprinted` |
335 | 1 | Alexander Blum | * The file is locked during processing |
336 | 1 | Alexander Blum | * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted |
337 | 1 | Alexander Blum | * The files are moved to `./storage/unknown/(USER/)UUID(.checksums)` |
338 | 1 | Alexander Blum | * The Content is updated, if possible (processing state: unknown, path) |
339 | 1 | Alexander Blum | * The files are moved to `./storage/dropped/USER/UUID(.checksum(s))` |
340 | 1 | Alexander Blum | * The Content is updated (processing state: dropped, path) |
341 | 1 | Alexander Blum | |
342 | 1 | Alexander Blum | #### Archive a file |
343 | 1 | Alexander Blum | |
344 | 1 | Alexander Blum | * For further details, see |
345 | 1 | Alexander Blum | * [[Specification#Archiving-of-the-files|Archiving of the files]] |
346 | 1 | Alexander Blum | * [[Specification#Orchestration-of-the-archiving|Orchestration of the archiving]] |
347 | 1 | Alexander Blum | * For each file in `./storage/dropped` |
348 | 1 | Alexander Blum | * The file is locked during processing |
349 | 1 | Alexander Blum | * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted |
350 | 1 | Alexander Blum | * The files are moved to `./storage/unknown/(USER/)UUID(.checksum(s))` |
351 | 1 | Alexander Blum | * The Content is updated, if possible (processing state: unknown) |
352 | 1 | Alexander Blum | * The files are moved to `./storage/dropped.closed/UUID(.checksum(s))` |
353 | 1 | Alexander Blum | * The Storehouse target `LOCATION`s are copied from `./storage/targets/` to `./storage/dropped.closed.targets/` |
354 | 1 | Alexander Blum | * Until there is no `LOCATION` in `./storage/dropped.closed.targets/` left |
355 | 1 | Alexander Blum | * Until the checksums of the whole files are valid on the target machine |
356 | 1 | Alexander Blum | * The files `UUID(.checksum(s))` in `./storage/dropped.closed/` are copied to `LOCATION:./UUID[1]/UUID[2]/` |
357 | 1 | Alexander Blum | * The target location `./storage/dropped.closed.targets/LOCATION` is deleted |
358 | 1 | Alexander Blum | * The target location folder `./storage/dropped.closed.targets/` is deleted |
359 | 1 | Alexander Blum | * The Content is updated (processing state: archived, archive: Archive, path) |
360 | 1 | Alexander Blum | |
361 | 1 | Alexander Blum | #### Delete a file |
362 | 1 | Alexander Blum | |
363 | 1 | Alexander Blum | * A user may request the deletion of uncommitted Content |
364 | 1 | Alexander Blum | * The corresponding files may only be deleted if the content is in the state 'uploaded' or 'rejected' |
365 | 1 | Alexander Blum | * Furthermore, the request for deletion needs to trigger the deletion of |
366 | 1 | Alexander Blum | * the Content object |
367 | 1 | Alexander Blum | * the preview and excerpt file |
368 | 1 | Alexander Blum | * the entry in the echoprint server referencing the deleted Content object (maybe decentralized) |
369 | 1 | Alexander Blum | |
370 | 1 | Alexander Blum | ## Archiving |
371 | 1 | Alexander Blum | |
372 | 1 | Alexander Blum | ### Objects |
373 | 1 | Alexander Blum | |
374 | 1 | Alexander Blum | #### Archiving |
375 | 1 | Alexander Blum | |
376 | 1 | Alexander Blum | * The archiving process is coordinated via objects in the Database shown here [[Databasemodels#Archiving|as diagram]] |
377 | 1 | Alexander Blum | * Physical: [[Specification#Storehouse|Storehouse]], [[Specification#Harddisk|Harddisk]], [[Specification#Filesystem|Filesystem]], [[Specification#Content|Content]] |
378 | 1 | Alexander Blum | * Logical: [[Specification#HarddiskLabel|HarddiskLabel]], [[Specification#FilesystemLabel|FilesystemLabel]] |
379 | 1 | Alexander Blum | * Integrity: [[Specification#Checksum|Checksum]], [[Specification#HarddiskTest|HarddiskTest]] |
380 | 1 | Alexander Blum | * The archiving objects are administered via tryton client and [[Scripts]] |
381 | 1 | Alexander Blum | |
382 | 1 | Alexander Blum | #### Storehouse |
383 | 1 | Alexander Blum | |
384 | 1 | Alexander Blum | Physical storage location |
385 | 1 | Alexander Blum | |
386 | 1 | Alexander Blum | * has a code |
387 | 1 | Alexander Blum | * has an admin user |
388 | 1 | Alexander Blum | * may have a detailed description |
389 | 1 | Alexander Blum | * may have many Harddisks |
390 | 1 | Alexander Blum | |
391 | 1 | Alexander Blum | #### Harddisk |
392 | 1 | Alexander Blum | |
393 | 1 | Alexander Blum | Physical harddisks |
394 | 1 | Alexander Blum | |
395 | 1 | Alexander Blum | * has uuids (host, harddisk) |
396 | 1 | Alexander Blum | * has checksums (harddisk) |
397 | 1 | Alexander Blum | * has a HarddiskLabel |
398 | 1 | Alexander Blum | * has a Storehouse |
399 | 1 | Alexander Blum | * has a version (for Harddisks with the same HarddiskLabel per Storehouse) |
400 | 1 | Alexander Blum | * has a closed state |
401 | 1 | Alexander Blum | * has an on-/offline state |
402 | 1 | Alexander Blum | * has a usage state |
403 | 1 | Alexander Blum | * has a creator (tryton user) |
404 | 1 | Alexander Blum | * has a function to generate a label sticker |
405 | 1 | Alexander Blum | * may have a local position (e.g. "Shelf1") |
406 | 1 | Alexander Blum | * may have many Filesystems |
407 | 1 | Alexander Blum | * may have many HarddiskTests |
408 | 1 | Alexander Blum | * may have a health state (result of the tests) |
409 | 1 | Alexander Blum | |
410 | 1 | Alexander Blum | #### Filesystem |
411 | 1 | Alexander Blum | |
412 | 1 | Alexander Blum | Filesystems on a harddisk |
413 | 1 | Alexander Blum | |
414 | 1 | Alexander Blum | * has uuids (partition, raid, raid sub, crypto, lvm, filesystem) |
415 | 1 | Alexander Blum | * has checksums (partition, raid, raid sub, crypto, lvm, filesystem) |
416 | 1 | Alexander Blum | * has a FilesystemLabel |
417 | 1 | Alexander Blum | * has a Harddisk |
418 | 1 | Alexander Blum | * has a closed state |
419 | 1 | Alexander Blum | * has partitioning information (partition number) |
420 | 1 | Alexander Blum | * has raid information (raid type, raid number, raid total) |
421 | 1 | Alexander Blum | |
422 | 1 | Alexander Blum | #### Content |
423 | 1 | Alexander Blum | |
424 | 1 | Alexander Blum | Contents on a filesystem |
425 | 1 | Alexander Blum | |
426 | 1 | Alexander Blum | * has a file |
427 | 1 | Alexander Blum | * may have one FilesystemLabel |
428 | 1 | Alexander Blum | |
429 | 1 | Alexander Blum | #### HarddiskLabel |
430 | 1 | Alexander Blum | |
431 | 1 | Alexander Blum | Label for Harddisks containing the same Filesystems |
432 | 1 | Alexander Blum | |
433 | 1 | Alexander Blum | * has a code |
434 | 1 | Alexander Blum | * may have many Harddisks |
435 | 1 | Alexander Blum | |
436 | 1 | Alexander Blum | #### FilesystemLabel |
437 | 1 | Alexander Blum | |
438 | 1 | Alexander Blum | Label for Filesystems containing the same Contents |
439 | 1 | Alexander Blum | |
440 | 1 | Alexander Blum | * has a code |
441 | 1 | Alexander Blum | * may have many Filesystems |
442 | 1 | Alexander Blum | * may have many Contents |
443 | 1 | Alexander Blum | |
444 | 1 | Alexander Blum | #### Checksum |
445 | 1 | Alexander Blum | |
446 | 1 | Alexander Blum | Checksum, e.g. sha256 |
447 | 1 | Alexander Blum | |
448 | 1 | Alexander Blum | * has a timestamp |
449 | 1 | Alexander Blum | * has a begin (first Byte) |
450 | 1 | Alexander Blum | * has an end (last Byte) |
451 | 1 | Alexander Blum | * has an algorithm |
452 | 1 | Alexander Blum | |
453 | 1 | Alexander Blum | #### HarddiskTest |
454 | 1 | Alexander Blum | |
455 | 1 | Alexander Blum | Integrity tests of harddisks |
456 | 1 | Alexander Blum | |
457 | 1 | Alexander Blum | * has a timestamp |
458 | 1 | Alexander Blum | * has a user, which performed the test |
459 | 1 | Alexander Blum | * has a health state (sane + error for each checksum of Harddisk and Filesystem) |
460 | 1 | Alexander Blum | |
461 | 1 | Alexander Blum | ### Identification with uuids |
462 | 1 | Alexander Blum | |
463 | 1 | Alexander Blum | * A uuid is a Universally Unique Identifier |
464 | 1 | Alexander Blum | * Harddisks, Filesystems and Content are identified with uuids |
465 | 1 | Alexander Blum | * A Harddisk is identified with a combination of the uuids |
466 | 1 | Alexander Blum | * Host |
467 | 1 | Alexander Blum | * Harddisk |
468 | 1 | Alexander Blum | * A Filesystem is identified with a combination of the uuids
469 | 1 | Alexander Blum | * Partition |
470 | 1 | Alexander Blum | * Raid |
471 | 1 | Alexander Blum | * Raid Sub |
472 | 1 | Alexander Blum | * Crypto |
473 | 1 | Alexander Blum | * LVM |
474 | 1 | Alexander Blum | * Filesystem |
475 | 1 | Alexander Blum | * The uuids of a Harddisk and a contained Filesystem are strictly hierarchical: |
476 | 1 | Alexander Blum | * Host > Harddisk > Partition > Raid > Raid Sub > Crypto > LVM > Filesystem |
477 | 1 | Alexander Blum | * A Content is identified with exactly one uuid |
478 | 1 | Alexander Blum | |
479 | 1 | Alexander Blum | ### Integrity tests with checksums |
480 | 1 | Alexander Blum | |
481 | 1 | Alexander Blum | * When a Harddisk is finalized, the following Checksums are saved to the database
482 | 1 | Alexander Blum | * Checksum of the Harddisk: Harddisk |
483 | 1 | Alexander Blum | * Checksums of the Filesystem: Partition, Raid, Raid Sub, Crypto, LVM, Filesystem |
484 | 1 | Alexander Blum | * For each Content the following Checksums are saved to the database
485 | 1 | Alexander Blum | * Checksum for the whole file |
486 | 1 | Alexander Blum | * Checksums for each upload chunk |
487 | 1 | Alexander Blum | * All checksums are additionally saved on the harddisk as a second independent source
488 | 1 | Alexander Blum | * Harddisk/Filesystem: on a metadata partition |
489 | 1 | Alexander Blum | * File/Chunks: on the filesystem |
490 | 1 | Alexander Blum | * A checksum of the metadata partition is saved only on the harddisk |
491 | 1 | Alexander Blum | * Regularly scheduled (e.g. biannual) integrity tests ensure the long-term integrity of all Harddisks
492 | 1 | Alexander Blum | * For each test |
493 | 1 | Alexander Blum | * Check of uuids |
494 | 1 | Alexander Blum | * Check of checksum of metadata partition |
495 | 1 | Alexander Blum | * Activation of crypto and raid |
496 | 1 | Alexander Blum | * Check of checksums of filesystems /dev/disk/by-uuid/... |
497 | 1 | Alexander Blum | * On error: |
498 | 1 | Alexander Blum | * Feedback Admin |
499 | 1 | Alexander Blum | * Write HarddiskTest (state: error_filesystem) |
500 | 1 | Alexander Blum | * Check of Checksums of all files
501 | 1 | Alexander Blum | * Write HarddiskTest (state: sane)
502 | 1 | Alexander Blum | * An error might be tracked down to the resolution of the upload chunks, if necessary
503 | 1 | Alexander Blum | |
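The per-file and per-chunk checksums described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the tuple layout `(begin, end, digest)` and the chunk size are assumptions, and sha256 is used only because the Checksum entity names it as an example algorithm.

```python
import hashlib


def file_and_chunk_checksums(data: bytes, chunk_size: int):
    """Return the sha256 of the whole file plus one checksum per chunk.

    Each chunk entry is (begin, end, digest), where begin is the first byte
    and end the last byte covered, mirroring the Checksum fields.
    chunk_size is a hypothetical parameter; the spec does not fix it.
    """
    whole = hashlib.sha256(data).hexdigest()
    chunks = []
    for begin in range(0, len(data), chunk_size):
        piece = data[begin:begin + chunk_size]
        end = begin + len(piece) - 1
        chunks.append((begin, end, hashlib.sha256(piece).hexdigest()))
    return whole, chunks


def find_corrupt_chunks(data: bytes, chunks):
    """Return the (begin, end) ranges whose stored digest no longer matches."""
    corrupt = []
    for begin, end, digest in chunks:
        if hashlib.sha256(data[begin:end + 1]).hexdigest() != digest:
            corrupt.append((begin, end))
    return corrupt
```

If the whole-file checksum fails during an integrity test, comparing the chunk checksums narrows the damage down to individual byte ranges, which is the "resolution of the upload chunks" mentioned in the list.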
504 | 1 | Alexander Blum | |
505 | 1 | Alexander Blum | ### Archiving of the files |
506 | 1 | Alexander Blum | |
507 | 1 | Alexander Blum | * The files associated with a Content are stored on a filesystem |
508 | 1 | Alexander Blum | * Each file is named after the Content Uuid: `UUID` |
509 | 1 | Alexander Blum | * Each file is associated with two other files by convention: |
510 | 1 | Alexander Blum | * `UUID.checksum`: checksum of the whole file |
511 | 1 | Alexander Blum | * `UUID.checksums`: checksums of all upload chunks |
512 | 1 | Alexander Blum | * Each file is stored in the folder `./UUID[0]/UUID[1]/` |
513 | 1 | Alexander Blum | * Folder structure for optimized filesystem access
514 | 1 | Alexander Blum | * No semantic information (e.g. USER) to avoid the need for corrections |
515 | 1 | Alexander Blum | * The full syntax for a file is |
516 | 1 | Alexander Blum | * `./UUID[0]/UUID[1]/UUID(.checksum(s))` |
517 | 1 | Alexander Blum | * Examples of file paths |
518 | 1 | Alexander Blum | * `./3/5/35f0c169-6594-4bf3-b285-451d2aa8c61e` |
519 | 1 | Alexander Blum | |
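The path convention above can be expressed in a few lines. The following is a sketch under assumptions: the function name `archive_paths` and the returned dictionary keys are hypothetical, and the uuid is normalized to its canonical lowercase form before the first two characters are used as folder names.

```python
import uuid


def archive_paths(content_uuid: str) -> dict:
    """Build the file path and its two companion paths for a Content uuid.

    Follows the convention ./UUID[0]/UUID[1]/UUID(.checksum(s)).
    Raises ValueError if content_uuid is not a valid uuid.
    """
    canonical = str(uuid.UUID(content_uuid))  # validates and lowercases
    base = f"./{canonical[0]}/{canonical[1]}/{canonical}"
    return {
        "file": base,
        "checksum": base + ".checksum",    # checksum of the whole file
        "checksums": base + ".checksums",  # checksums of all upload chunks
    }
```

For the example uuid from the list, `archive_paths("35f0c169-6594-4bf3-b285-451d2aa8c61e")["file"]` yields `./3/5/35f0c169-6594-4bf3-b285-451d2aa8c61e`.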
520 | 1 | Alexander Blum | |
521 | 1 | Alexander Blum | ### Orchestration of the archiving |
522 | 1 | Alexander Blum | |
523 | 1 | Alexander Blum | * Files on an [[Specification#Intermediate-storage-for-archive|intermediate storage]] may be archived in many storehouses |
524 | 1 | Alexander Blum | * The URLs to the storehouse machines are stored in STOREHOUSECODE target files |
525 | 1 | Alexander Blum | * The mappings of all intermediate storages to storehouses are administered on a central orchestration machine
526 | 1 | Alexander Blum | * Syntax: `./STORAGE/STOREHOUSECODE` |
527 | 1 | Alexander Blum | * Example: `./storage001/DÜ1` |
528 | 1 | Alexander Blum | * The mapping of a single intermediate storage to storehouses is mirrored to this intermediate storage
529 | 1 | Alexander Blum | * Syntax: `./storage/targets/STOREHOUSECODE` |
530 | 1 | Alexander Blum | * Example: `./storage/targets/DÜ1` |
531 | 1 | Alexander Blum | * The files on the intermediate storage are synchronized with the files on the orchestration machine for orchestration |
532 | 1 | Alexander Blum | |
533 | 1 | Alexander Blum | |
534 | 1 | Alexander Blum | ### Human readable label sticker |
535 | 1 | Alexander Blum | |
536 | 1 | Alexander Blum | * For each Harddisk, a label sticker may be generated |
537 | 1 | Alexander Blum | * Header: `PURPOSE-STOREHOUSECODE-CONTAINERLABELCODE (RAIDTYPE: RAIDNUMBER/RAIDTOTAL)` |
538 | 1 | Alexander Blum | * `PURPOSE`: "UCR" ("User Content Repertoire") |
539 | 1 | Alexander Blum | * `STOREHOUSECODE`: Code of the Storehouse, e.g. "DÜ1" for the first storehouse in Düsseldorf |
540 | 1 | Alexander Blum | * `CONTAINERLABELCODE`: Incremental, padded to 5 digits, e.g. 00001 |
541 | 1 | Alexander Blum | * `RAIDTYPE`: Type of the raid
542 | 1 | Alexander Blum | * `RAIDNUMBER`: number of the harddisk in the raid |
543 | 1 | Alexander Blum | * `RAIDTOTAL`: total number of harddisks in the raid |
544 | 1 | Alexander Blum | * Details: List of `ARCHIVELABELCODE`s |
545 | 1 | Alexander Blum | * `ARCHIVELABELCODE`: Label of the Archive |
546 | 1 | Alexander Blum | |
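The header format of the label sticker can be illustrated with a small formatter. This is a sketch only: the function name, its parameters and the raid type value `RAID1` used below are assumptions for illustration and do not appear in the specification.

```python
def sticker_header(storehouse_code: str, container_number: int,
                   raid_type: str, raid_number: int, raid_total: int,
                   purpose: str = "UCR") -> str:
    """Format PURPOSE-STOREHOUSECODE-CONTAINERLABELCODE (RAIDTYPE: RAIDNUMBER/RAIDTOTAL).

    container_number is the incremental counter; it is zero-padded to
    5 digits to form the CONTAINERLABELCODE.
    """
    container_label_code = f"{container_number:05d}"
    return (f"{purpose}-{storehouse_code}-{container_label_code} "
            f"({raid_type}: {raid_number}/{raid_total})")
```

For the first harddisk of a two-disk raid in the first Düsseldorf storehouse, `sticker_header("DÜ1", 1, "RAID1", 1, 2)` gives `UCR-DÜ1-00001 (RAID1: 1/2)`.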
547 | 1 | Alexander Blum | # Sort |
548 | 1 | Alexander Blum | |
549 | 1 | Alexander Blum | * Only more liberal licenses should be allowed. |