Specification » History » Version 5

Alexander Blum, 03/08/2023 03:03 AM

{{toc}}

# General

## Object Processing

### Commits

* Web users provide the system with metadata on objects as the information basis for the distribution of money
* The system needs to balance the need of web users to change the metadata against the stability of the information basis
* Artists, Releases, and Creations have to be committed by the web users as a conscious act
    * No data is published before a commit
    * Web users are able to distinguish public/private fields after a commit
    * A commit of a set of objects (all fields, public fields) is saved as rendered text for evidence
* Creations do not need to have content to be committable
* Content related to a Creation is auto-committed when the Creation is committed
* Before a commit, web users may edit the object freely
* Committing an object implies committing all objects down the hierarchy (see [[Workflows#Licenser-Cascades|Cascades]])
* After a commit, web users may
    * edit only data not relevant for the distribution or subject to frequent changes:
        * adding/removing members of an Artist
        * adding a Release to a Creation
        * changing Release metadata
    * trigger a dispute request for changing relevant data:
        * deleting an object (implies creation of a duplicate for references)
        * adding/removing a contributor to a Creation
        * adding/removing an original/derivative Creation
        * adding/removing licenses to Releases of Creations
        * adding/removing the Content assigned to a Creation
        * removing a Release from a Creation
* After a commit, administrators may
    * revise the object(s)
    * reject the object(s)
* A successful claim dispute (see [[Workflows#Commit]]) causes the object to be uncommitted
* Uncommitted objects should be recommitted as soon as possible by the web user
    * if the artist name or creation title is changed on recommit, all admins of referencing objects are informed via email/web interface
    * if a referenced object is deleted before recommit, a duplicate for all references is created and referenced instead
* Created / Uncommitted objects
    * are promoted on the web user dashboard
    * are excluded from searches (via add/list/etc), except
        * web users can add own uncommitted objects to other own uncommitted objects
        * web users can see but not add own uncommitted objects to other committed objects

### Disputes

* A dispute is the process of mediation in a conflict between web users
* A dispute has a
    * code: Sequence
    * state: Selection (requested, assigned, resolved)
    * case: Selection (list of use cases?)
    * object: Reference (to the disputed objects)
    * assignee: User
    * request_party: Party
    * request_text: Text (user statement)
    * request_time: DateTime
    * resolved_time: DateTime
    * comments: Comment (many comments by administrators)
* A dispute comment has a
    * dispute: Dispute
    * text: Text
    * time: DateTime
* A dispute may be triggered by a web user on several occasions
    * The web user [[Specification#Claims|claims]] ownership of an already claimed object
    * The web user claims authorship of a Content marked as a duplicate
    * The web user requests a change/deletion of [[Specification#Commits|committed]] content
* A dispute can be requested, assigned and resolved (see [[Workflows#Dispute]])
* A dispute is requested via a dispute form including
    * The disputed object
    * The issue category
    * The user motivation
* A dispute is handled by an assignee
* The assignee can add many comments
* The assignee can mark the dispute as resolved
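
The dispute fields and the requested/assigned/resolved life cycle listed above could be sketched as plain Python dataclasses. This is only an illustration, not the actual Tryton model: the method names (`assign`, `add_comment`, `resolve`), the field name `object_ref`, and the use of plain strings for users, parties and references are assumptions. The `dispute` back-reference of a comment is omitted because the comments are held by the dispute itself.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class DisputeComment:
    text: str
    time: datetime


@dataclass
class Dispute:
    code: str
    case: str
    object_ref: str          # hypothetical: reference to the disputed object
    request_party: str
    request_text: str        # user statement
    request_time: datetime
    state: str = "requested"
    assignee: Optional[str] = None
    resolved_time: Optional[datetime] = None
    comments: List[DisputeComment] = field(default_factory=list)

    def assign(self, user: str) -> None:
        # requested -> assigned
        if self.state != "requested":
            raise ValueError("only a requested dispute can be assigned")
        self.assignee = user
        self.state = "assigned"

    def add_comment(self, text: str, time: datetime) -> None:
        self.comments.append(DisputeComment(text, time))

    def resolve(self, time: datetime) -> None:
        # assigned -> resolved
        if self.state != "assigned":
            raise ValueError("only an assigned dispute can be resolved")
        self.state = "resolved"
        self.resolved_time = time
```

Rejecting out-of-order transitions keeps the state field consistent with the workflow, which is the main point of the sketch.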

# Licensing

## Tariff System

* A creation can have several tariff categories represented by different collecting societies

## API

* TODO

## Collection

## Allocation
* Invoice

## Declaration

## Utilisation
* Tariff
* UtilisationCreationList ([[Usecases#Examples]])
* UtilisationIndicator (Base, Relevance, Adjustments)
* ContextIndicator
* ...

## Indicators

* UtilisationIndicator
* ContextIndicators (for each tariff)

# Repertoire

## Rights Management

* A rightsholder
    * must hold a right
        * Copyright
        * Ancillary Copyright
    * must have an object to which the right belongs
        * Creation
        * Release
    * must have a contribution depending on the object and right
        * Creation
            * Copyright
                * Lyrics
                * Composition
            * Ancillary Copyright
                * Instrument
                * Production
                * Mixing
                * Mastering
        * Release
            * Copyright
                * Artwork
                * Text
                * Layout
            * Ancillary Copyright
                * Production
                * Mixing
                * Mastering
    * may have a start and end date
    * may be restricted to a territory
    * may have a successor
    * may be represented by a collecting society
    * may have a list of instruments for Creation -> Ancillary Copyright -> Instrument
* Rightsholder subjects for creations and releases are Artists
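
The dependency of the contribution on object and right can be expressed as a lookup table. The following is a minimal sketch; the function name `is_valid_contribution` and the string constants are assumptions chosen for illustration, only the combinations themselves are taken from the list above.

```python
# (object type, right) -> allowed contributions, as listed above
CONTRIBUTIONS = {
    ("Creation", "Copyright"): {"Lyrics", "Composition"},
    ("Creation", "Ancillary Copyright"): {
        "Instrument", "Production", "Mixing", "Mastering"},
    ("Release", "Copyright"): {"Artwork", "Text", "Layout"},
    ("Release", "Ancillary Copyright"): {"Production", "Mixing", "Mastering"},
}


def is_valid_contribution(object_type, right, contribution, instruments=()):
    """Check a rightsholder entry against the allowed combinations."""
    allowed = CONTRIBUTIONS.get((object_type, right), set())
    if contribution not in allowed:
        return False
    # A list of instruments is only meaningful for
    # Creation -> Ancillary Copyright -> Instrument
    is_instrument = (object_type, right, contribution) == (
        "Creation", "Ancillary Copyright", "Instrument")
    return is_instrument or not instruments
```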

## Object processing

### Claims

* Artists, Releases, and Creations are unclaimed, offered or claimed
* Unclaimed objects are objects that do not belong to a web user
* Offered objects are objects that might belong to a web user and are offered to be claimed by the web user (e.g. on registration)
* Claimed objects are objects that "belong" to at least one web user
* Web users can (see [[Workflows#Claim]]) claim
    * unclaimed and offered objects in general
    * a solo artist for the current web user
    * a group artist for a solo artist
    * a compilation release for a solo artist (role: producer)
    * a split/artist release for an artist
    * a creation for an artist
* A claim implies a request for admin rights, where applicable
* Revised objects are visually promoted on the object list/details
* Unclaiming/Claiming/Revising an object implies unclaiming/claiming/revising all objects down the hierarchy (see [[Workflows#Creative-Cascades|Cascades]])
* Claiming an unclaimed object results in an uncommitted object
* Claiming a claimed object
    * results in a [[Specification#Disputes|dispute]]
    * may result in an uncommitted object, if the disputing party proves to be right

### Foreign objects

* During object creation, a web user may create several new, [[Specification#Claims|unclaimed]] foreign objects
* When a web user creates a new Artist, they may create many member Artists specified by a name and an email
* When a web user creates a new Creation, they may create
    * many contributor Artists specified by group (yes/no), a name and an email
    * many original/derivative Creations specified by the creation and the artist name, resulting in
        * a new/referenced Artist object: artist name
        * a new Creation object: Artist
    * many track Creations specified by the creation and the artist name, resulting in
        * a new/referenced Artist object: artist name
        * a new Creation object: Artist
* Foreign objects are auto [[Specification#Commits|committed]]
* Foreign objects may be added by others
    * resulting in a duplicate for information separation
    * referencing the duplicated foreign object for deduplication
* Foreign objects are editable
    * by the web user who created the foreign object, if
        * the object was created by the web user
        * the object is not [[Specification#Claims|claimed]]
        * the object was not part of a distribution, yet
    * by the object admin of the object for which the foreign object was created

## File Processing

### Intermediate storage for archive

* Files are stored on an intermediate file server
* For every stage of a file there is a corresponding folder `STAGE`
* For each change of processing state, all files are moved into the next `STAGE` folder
* For each folder `STAGE` there is a folder per `USER`
    * Folder structure for optimized filesystem access
    * Semantic information for manual administrative interventions
* The filenames of the files are
    * the `HASH` of the filename for temporary uploads
    * the `UUID` of the content in the database for all other stages
* For each file `[UUID|HASH]` there is a file `[UUID|HASH].checksums`
    * Content: Checksums for each single upload chunk
    * Format: CSV (begin, end, algorithm, checksum)
* For each file `[UUID|HASH]` there is a file `UUID.checksum`
    * Content: Checksum of the whole file
    * Format: Plain text
* The full syntax for a file is
    * `./STAGE/USER/UUID(.checksum(s))`
    * `./temporary/USER/HASH(.checksum(s))`
* Examples of file paths
    * `./temporary/4/82d5582443e9f8d35d3ec798662255e46e9e8138c290b626a74a3bee9382d430`
    * `./temporary/4/82d5582443e9f8d35d3ec798662255e46e9e8138c290b626a74a3bee9382d430.checksums`
    * `./uploaded/4/35f0c169-6594-4bf3-b285-451d2aa8c61e`
    * `./uploaded/4/35f0c169-6594-4bf3-b285-451d2aa8c61e.checksums`
    * `./previewed/4/35f0c169-6594-4bf3-b285-451d2aa8c61e`
    * `./checksummed/4/35f0c169-6594-4bf3-b285-451d2aa8c61e`
    * `./checksummed/4/35f0c169-6594-4bf3-b285-451d2aa8c61e.checksum`
    * `./checksummed/4/35f0c169-6594-4bf3-b285-451d2aa8c61e.checksums`
    * `./fingerprinted/4/35f0c169-6594-4bf3-b285-451d2aa8c61e`
    * `./dropped/4/35f0c169-6594-4bf3-b285-451d2aa8c61e`
    * `./rejected/4/35f0c169-6594-4bf3-b285-451d2aa8c61e`
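
A small helper can build these paths from their parts. The sketch below is an illustration only: the function names are hypothetical, and the use of SHA-256 for the temporary `HASH` is an assumption (the 64 hex digits in the examples would fit, but the specification does not name the algorithm).

```python
import hashlib

SUFFIXES = ("", ".checksum", ".checksums")


def intermediate_path(stage, user, name, suffix=""):
    """Build ./STAGE/USER/NAME(.checksum(s)) for the intermediate storage."""
    if suffix not in SUFFIXES:
        raise ValueError("unknown suffix: %r" % suffix)
    return f"./{stage}/{user}/{name}{suffix}"


def temporary_name(filename):
    """Hash of the user-given filename (assumption: SHA-256, hex encoded)."""
    return hashlib.sha256(filename.encode("utf-8")).hexdigest()
```

For example, `intermediate_path("temporary", 4, temporary_name("song.ogg"), ".checksums")` yields a path of the same shape as the second example in the list above.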

### Permanent storage for user content

* User content is stored on a permanent file server
* For every content type of a file there is a corresponding folder `CONTENTTYPE` (e.g. 'previews')
* For each folder `CONTENTTYPE` there is a folder per `USER`
    * Folder structure for optimized filesystem access
    * Semantic information for manual administrative interventions
* The filenames of the files are
    * the `UUID` of the content in the database
* The full syntax for a file is
    * `./CONTENTTYPE/USER/UUID`
* Examples of file paths
    * `./previews/4/35f0c169-6594-4bf3-b285-451d2aa8c61e`
    * `./excerpts/4/35f0c169-6594-4bf3-b285-451d2aa8c61e`

### Stages

* For an overview of the workflow, see [[Workflows#File-Processing|Workflow]]
* Each stage of file processing results in a corresponding processing state
    1. Upload *(uploaded)*
    2. Process
        * Preview *(previewed)*
        * Checksum *(checksummed)*
        * Fingerprint *(fingerprinted)*
    3. Drop *(dropped)*
    4. Archive *(archived)*
* There is a special state for user requested deletions *(tobedeleted)*
* Before a file is processed at any stage, a file `UUID.lock` is created to signal other processes to skip the file. The lockfile is deleted after the file `UUID` has been moved to the next stage folder.
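
The lockfile convention could be implemented with an exclusive create, so that two processes can never both believe they hold the lock. This is a sketch under that assumption; the function names `try_lock` and `unlock` are hypothetical and not taken from the actual processing scripts.

```python
import os


def try_lock(path):
    """Create `path + '.lock'` atomically.

    Returns True if the lock was acquired, False if another
    process already holds it and the file should be skipped.
    """
    try:
        # O_EXCL makes the call fail if the lockfile already exists
        fd = os.open(path + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def unlock(path):
    """Remove the lockfile once the file has been moved to the next stage."""
    os.remove(path + ".lock")
```

A worker would call `try_lock` on each file in its stage folder, skip the file on `False`, and call `unlock` with the original path after the move.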

#### Upload a file

* Users are only allowed to upload content that belongs to them
* Upload of a file in chunks
    * The chunk size is 1MiB (1024*1024 Byte)
    * The chunk position is given by the header `Content-Range` (chunk start, chunk end, total size)
    * A HASH of the user-given filename is used as temporary filename
    * The file is stored in `./storage/temporary/USER/HASH`
* For each uploaded chunk
    * A checksum of the chunk is calculated, while the chunk is still in RAM
    * The checksum is appended to `./storage/temporary/USER/HASH.checksums` (CSV: begin, end, algorithm, checksum)
    * The database is queried for a duplicate of the checksum for early abuse detection
        * The chunk checksum collisions are tracked in the user session
        * Certain checksums are whitelisted (e.g. silence in different formats with/without headers)
        * Above a configurable threshold (e.g. 150), the user upload is restricted temporarily
        * The threshold violations are tracked in the database
    * The chunk is appended to `./storage/temporary/USER/HASH`
* When the upload is finished
    * A UUID is generated
    * If the validation of the file extension or mimetype fails, further processing is aborted
        * The files are moved to `./storage/rejected/USER/UUID(.checksums)`
        * The Content is saved to the database (processing state: rejected, rejection reason: format_error, path)
    * The files are moved to `./storage/uploaded/USER/UUID(.checksums)`
    * The Content is saved to the database (processing state: uploaded, path, storage_hostname)
    * The Checksums are saved to the database
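
The per-chunk steps can be sketched as follows. The sketch assumes the usual `bytes START-END/TOTAL` form of the `Content-Range` header and SHA-256 as the checksum algorithm; the function names are hypothetical, and the database query for duplicate checksums is left out.

```python
import csv
import hashlib
import re

CHUNK_SIZE = 1024 * 1024  # 1 MiB


def parse_content_range(header):
    """Return (begin, end, total) from a 'bytes B-E/T' header value."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", header.strip())
    if match is None:
        raise ValueError("unsupported Content-Range: %r" % header)
    begin, end, total = (int(group) for group in match.groups())
    if begin > end or end >= total:
        raise ValueError("inconsistent Content-Range: %r" % header)
    return begin, end, total


def store_chunk(path, header, chunk, algorithm="sha256"):
    """Checksum a chunk while it is in RAM, log it, then append it."""
    begin, end, _total = parse_content_range(header)
    if len(chunk) != end - begin + 1 or len(chunk) > CHUNK_SIZE:
        raise ValueError("chunk size does not match Content-Range")
    checksum = hashlib.new(algorithm, chunk).hexdigest()
    # CSV row: begin, end, algorithm, checksum
    with open(path + ".checksums", "a", newline="") as index:
        csv.writer(index).writerow([begin, end, algorithm, checksum])
    with open(path, "ab") as target:
        target.write(chunk)
    return checksum
```

Writing the checksum before the chunk mirrors the order of the list above, so a crash between the two writes leaves a checksum without data rather than unverified data.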

#### Create a preview

* For each file in `./storage/uploaded`
    * The file is locked during processing
    * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted
        * The files are moved to `./storage/unknown/(USER/)UUID(.checksums)`
        * The Content is updated, if possible (processing state: unknown, path)
    * An excerpt for analysis and statistics is taken and stored in `./content/excerpt/UUID[1]/UUID[2]/UUID`
        * Length: 60 seconds out of the middle of the file
        * Quality: Minimum for fingerprint recognition (11025 Hz, 16 bit, mono)
    * A preview is created and stored in `./content/previews/UUID[1]/UUID[2]/UUID`
        * Quality: Minimum for acceptable user experience (12bit, mono, 16kHz, ogg)
        * Configuration: fade in, fade out, segment interval, segment length, segment crossfade
    * If preview or excerpt creation fails, further processing is aborted
        * The files are moved to `./storage/rejected/USER/UUID(.checksums)`
        * The Content is rejected (rejection reason: format_error, path)
    * The files are moved to `./storage/previewed/USER/UUID(.checksums)`
    * The audio properties are saved to the database (length, channels, sample rate, sample width)
    * The Content is updated (processing state: previewed, path)

#### Calculate a checksum

* For each file in `./storage/previewed`
    * The file is locked during processing
    * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted
        * The files are moved to `./storage/unknown/(USER/)UUID(.checksums)`
        * The Content is updated, if possible (processing state: unknown, path)
    * Checksums for each chunk of 1MiB (1024*1024 Byte) are calculated and saved to the database, if not present
    * A checksum for the whole file is calculated and stored in `./storage/previewed/USER/UUID.checksum`
    * If the checksum is already present in the database, further processing is aborted
        * The preview and excerpt are deleted
        * The files are moved to `./storage/rejected/USER/UUID(.checksum(s))`
        * The Content is rejected (rejection reason: checksum_collision, duplicate of: Content, path)
    * The files are moved to `./storage/checksummed/USER/UUID(.checksum(s))`
    * The Checksum is saved to the database
    * The Content is updated (processing state: checksummed, path)
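
Both kinds of checksum can be computed in a single pass over the file. The following sketch assumes SHA-256 and returns the values instead of writing them to the database or to `UUID.checksum`; the function name `file_checksums` is hypothetical.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MiB


def file_checksums(path, algorithm="sha256", chunk_size=CHUNK_SIZE):
    """Return (whole-file checksum, [(begin, end, algorithm, checksum), ...])."""
    whole = hashlib.new(algorithm)
    chunks = []
    begin = 0
    with open(path, "rb") as source:
        while True:
            data = source.read(chunk_size)
            if not data:
                break
            whole.update(data)
            end = begin + len(data) - 1  # index of the last byte of the chunk
            digest = hashlib.new(algorithm, data).hexdigest()
            chunks.append((begin, end, algorithm, digest))
            begin = end + 1
    return whole.hexdigest(), chunks
```

Reading chunk by chunk keeps memory usage bounded by the chunk size regardless of the file size.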

#### Calculate a fingerprint

* For each file in `./storage/checksummed`
    * The file is locked during processing
    * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted
        * The files are moved to `./storage/unknown/(USER/)UUID(.checksums)`
        * The Content is updated, if possible (processing state: unknown, path)
    * The fingerprint is created
    * If the fingerprint is already present in the database, further processing is aborted
        * The preview and excerpt are deleted
        * The files are moved to `./storage/rejected/USER/UUID(.checksum(s))`
        * The Content is rejected (rejection reason: fingerprint_collision, duplicate of: Content, path)
    * The fingerprint is ingested into the database (primary key: Content Uuid)
    * A FingerprintLog is saved to the database (timestamp, user, algorithm, version)
    * The files are moved to `./storage/fingerprinted/USER/UUID(.checksum(s))`
    * The Content is updated (processing state: fingerprinted, path)

#### Drop a file

* For each file in `./storage/fingerprinted`
    * The file is locked during processing
    * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted
        * The files are moved to `./storage/unknown/(USER/)UUID(.checksums)`
        * The Content is updated, if possible (processing state: unknown, path)
    * The files are moved to `./storage/dropped/USER/UUID(.checksum(s))`
    * The Content is updated (processing state: dropped, path)

#### Archive a file

* For further details, see
    * [[Specification#Archiving-of-the-files|Archiving of the files]]
    * [[Specification#Orchestration-of-the-archiving|Orchestration of the archiving]]
* For each file in `./storage/dropped`
    * The file is locked during processing
    * If the associated Content (processing state, processing hostname, storage hostname) is not valid, further processing is aborted
        * The files are moved to `./storage/unknown/(USER/)UUID(.checksum(s))`
        * The Content is updated, if possible (processing state: unknown)
    * The files are moved to `./storage/dropped.closed/UUID(.checksum(s))`
    * The Storehouse target `LOCATION`s are copied from `./storage/targets/` to `./storage/dropped.closed.targets/`
    * Until there is no `LOCATION` in `./storage/dropped.closed.targets/` left
        * Until the checksums of the whole files are valid on the target machine
            * The files `UUID(.checksum(s))` in `./storage/dropped.closed/` are copied to `LOCATION:./UUID[1]/UUID[2]/`
        * The target location `./storage/dropped.closed.targets/LOCATION` is deleted
    * The target location folder `./storage/dropped.closed.targets/` is deleted
    * The Content is updated (processing state: archived, archive: Archive, path)
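
The two nested "until" loops can be sketched as below. The transport and the remote checksum validation are passed in as callables because the specification does not define them; `copy`, `verify` and the `max_attempts` limit are assumptions added for illustration (the limit only prevents an endless loop on a permanently failing target).

```python
def archive_to_targets(targets, copy, verify, max_attempts=3):
    """Copy to every target LOCATION until its checksums are valid.

    targets: iterable of LOCATIONs still pending
    copy(location): copies the files to the location (hypothetical)
    verify(location): True if the checksums are valid there (hypothetical)
    Returns the list of locations that were archived successfully.
    """
    pending = list(targets)
    done = []
    while pending:                      # until no LOCATION is left
        location = pending[0]
        for _ in range(max_attempts):   # until the checksums are valid
            copy(location)
            if verify(location):
                break
        else:
            raise RuntimeError("could not archive to %s" % location)
        done.append(pending.pop(0))     # the target location is removed
    return done
```

Removing a target from the pending list only after a successful verification means an interrupted run can simply be restarted with the remaining targets.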

#### Delete a file

* A user may request the deletion of uncommitted Content
* The corresponding files may only be deleted if the Content is in the state 'uploaded' or 'rejected'
* Furthermore, the request for deletion needs to trigger the deletion of
    * the Content object
    * the preview and excerpt file
    * the entry in the echoprint server referencing the deleted Content object (maybe decentralized)

## Archiving

### Objects

#### Archiving

* The archiving process is coordinated via objects in the database, shown [[Databasemodels#Archiving|as a diagram]]
    * Physical: [[Specification#Storehouse|Storehouse]], [[Specification#Harddisk|Harddisk]], [[Specification#Filesystem|Filesystem]], [[Specification#Content|Content]]
    * Logical: [[Specification#HarddiskLabel|HarddiskLabel]], [[Specification#FilesystemLabel|FilesystemLabel]]
    * Integrity: [[Specification#Checksum|Checksum]], [[Specification#HarddiskTest|HarddiskTest]]
* The archiving objects are administered via the Tryton client and [[Scripts]]

#### Storehouse

Physical storage location

* has a code
* has an admin user
* may have a detailed description
* may have many Harddisks

#### Harddisk

Physical harddisks

* has uuids (host, harddisk)
* has checksums (harddisk)
* has a HarddiskLabel
* has a Storehouse
* has a version (for Harddisks with the same HarddiskLabel per Storehouse)
* has a closed state
* has an on-/offline state
* has a usage state
* has a creator (Tryton user)
* has a function to generate a label sticker
* may have a local position (e.g. "Shelf1")
* may have many Filesystems
* may have many HarddiskTests
* may have a health state (result of the tests)

#### Filesystem

Filesystems on a harddisk

* has uuids (partition, raid, raid sub, crypto, lvm, filesystem)
* has checksums (partition, raid, raid sub, crypto, lvm, filesystem)
* has a FilesystemLabel
* has a Harddisk
* has a closed state
* has partitioning information (partition number)
* has raid information (raid type, raid number, raid total)

#### Content

Contents on a filesystem

* has a file
* may have one FilesystemLabel

#### HarddiskLabel

Label for Harddisks containing the same Filesystems

* has a code
* may have many Harddisks

#### FilesystemLabel

Label for Filesystems containing the same Contents

* has a code
* may have many Filesystems
* may have many Contents

#### Checksum

Checksum, e.g. sha256

* has a timestamp
* has a begin (first Byte)
* has an end (last Byte)
* has an algorithm

#### HarddiskTest

Integrity tests of harddisks

* has a timestamp
* has a user, which performed the test
* has a health state (sane + error for each checksum of Harddisk and Filesystem)

### Identification with uuids

* A uuid is a Universally Unique Identifier
* Harddisks, Filesystems and Content are identified with uuids
* A Harddisk is identified with a combination of the uuids
    * Host
    * Harddisk
* A Filesystem is identified with a combination of the uuids
    * Partition
    * Raid
    * Raid Sub
    * Crypto
    * LVM
    * Filesystem
* The uuids of a Harddisk and a contained Filesystem are strictly hierarchical:
    * Host > Harddisk > Partition > Raid > Raid Sub > Crypto > LVM > Filesystem
* A Content is identified with exactly one uuid

### Integrity tests with checksums

* When a Harddisk is finalized, the following Checksums are saved to the database
    * Checksum of the Harddisk: Harddisk
    * Checksums of the Filesystem: Partition, Raid, Raid Sub, Crypto, LVM, Filesystem
* For each Content the following Checksums are saved to the database
    * Checksum for the whole file
    * Checksums for each upload chunk
* All checksums are additionally saved on the harddisk as a second independent source
    * Harddisk/Filesystem: on a metadata partition
    * File/Chunks: on the filesystem
* A checksum of the metadata partition is saved only on the harddisk
* Regularly scheduled (e.g. biannual) integrity tests ensure the long-term integrity of all Harddisks
    * For each test
        * Check of uuids
        * Check of checksum of metadata partition
        * Activation of crypto and raid
        * Check of checksums of filesystems /dev/disk/by-uuid/...
        * On error:
            * Feedback to the admin
            * Write HarddiskTest (state: error_filesystem)
            * Check of Checksums of all files
        * Write HarddiskTest (state: sane)
    * An error might be tracked down to the resolution of the upload chunks, if necessary
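
Tracking an error down to single upload chunks could look like the sketch below. It assumes the `UUID.checksums` CSV layout described for the file processing (begin, end, algorithm, checksum, with inclusive byte positions) and algorithm names that `hashlib` understands; the function name `corrupted_chunks` is hypothetical.

```python
import csv
import hashlib


def corrupted_chunks(path):
    """Return the (begin, end) ranges whose checksum no longer matches."""
    bad = []
    with open(path + ".checksums", newline="") as index, \
            open(path, "rb") as data:
        for begin, end, algorithm, checksum in csv.reader(index):
            begin, end = int(begin), int(end)
            data.seek(begin)
            chunk = data.read(end - begin + 1)
            if hashlib.new(algorithm, chunk).hexdigest() != checksum:
                bad.append((begin, end))
    return bad
```

An empty result means every chunk still matches; otherwise the returned byte ranges narrow the damage down to 1MiB blocks.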

### Archiving of the files

* The files associated with a Content are stored on a filesystem
* Each file is named after the Content Uuid: `UUID`
* Each file is associated with two other files by convention:
    * `UUID.checksum`: checksum of the whole file
    * `UUID.checksums`: checksums of all upload chunks
* Each file is stored in the folder `./UUID[0]/UUID[1]/`
    * Folder structure for optimized filesystem access
    * No semantic information (e.g. USER) to avoid the need for corrections
* The full syntax for a file is
    * `./UUID[0]/UUID[1]/UUID(.checksum(s))`
* Examples of file paths
    * `./3/5/35f0c169-6594-4bf3-b285-451d2aa8c61e`
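
The archive path follows directly from the first two characters of the Content Uuid. A minimal sketch, with the hypothetical function name `archive_path` and the standard `uuid` module used only to reject malformed input:

```python
import uuid


def archive_path(content_uuid, suffix=""):
    """Build ./UUID[0]/UUID[1]/UUID(.checksum(s)) for the archive."""
    if suffix not in ("", ".checksum", ".checksums"):
        raise ValueError("unknown suffix: %r" % suffix)
    # uuid.UUID raises ValueError for malformed input and
    # normalises the string to lowercase with hyphens
    name = str(uuid.UUID(content_uuid))
    return f"./{name[0]}/{name[1]}/{name}{suffix}"
```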

### Orchestration of the archiving

* Files on an [[Specification#Intermediate-storage-for-archive|intermediate storage]] may be archived in many storehouses
* The URLs to the storehouse machines are stored in STOREHOUSECODE target files
* The mappings of all intermediate storages to storehouses are administered on a central orchestration machine
    * Syntax: `./STORAGE/STOREHOUSECODE`
    * Example: `./storage001/DÜ1`
* The mapping of a single intermediate storage to storehouses is mirrored to this intermediate storage
    * Syntax: `./storage/targets/STOREHOUSECODE`
    * Example: `./storage/targets/DÜ1`
* The files on the intermediate storage are synchronized with the files on the orchestration machine for orchestration

### Human readable label sticker

* For each Harddisk, a label sticker may be generated
* Header: `PURPOSE-STOREHOUSECODE-CONTAINERLABELCODE (RAIDTYPE: RAIDNUMBER/RAIDTOTAL)`
    * `PURPOSE`: "UCR" ("User Content Repertoire")
    * `STOREHOUSECODE`: Code of the Storehouse, e.g. "DÜ1" for the first storehouse in Düsseldorf
    * `CONTAINERLABELCODE`: Incremental, padded to 5 digits, e.g. 00001
    * `RAIDTYPE`: Type of the raid
    * `RAIDNUMBER`: number of the harddisk in the raid
    * `RAIDTOTAL`: total number of harddisks in the raid
* Details: List of `ARCHIVELABELCODE`s
    * `ARCHIVELABELCODE`: Label of the Archive
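
The header line of the sticker can be assembled with a format string. The function name `label_header` and the example raid type `RAID1` are assumptions for illustration; only the layout and the 5-digit padding come from the list above.

```python
def label_header(storehouse_code, label_number, raid_type,
                 raid_number, raid_total, purpose="UCR"):
    """PURPOSE-STOREHOUSECODE-CONTAINERLABELCODE (RAIDTYPE: NUMBER/TOTAL)"""
    return (f"{purpose}-{storehouse_code}-{label_number:05d} "
            f"({raid_type}: {raid_number}/{raid_total})")
```

For instance, `label_header("DÜ1", 1, "RAID1", 1, 2)` gives `UCR-DÜ1-00001 (RAID1: 1/2)`.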

# Sort

* Only more liberal licenses should be allowed.