Each Carbonio installation consists of one primary volume and any number of secondary volumes. Secondary volumes are managed, and data is moved between them, using the Carbonio Storages module.
Hierarchical Storage Management (HSM) is a policy-based technique by which objects may be relocated according to defined criteria. One of the most practical uses, for instance, is to reserve the fastest storage for I/O-intensive operations and frequently accessed data, while keeping older data on the slower storage.
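As a rough illustration of the policy idea (this is not Carbonio's actual implementation; the function name and the 30-day threshold are hypothetical), an age-based policy simply decides which tier an item should live on:

```python
from datetime import datetime, timedelta

# Hypothetical sketch of an HSM-style policy decision: items not accessed
# within a cutoff window are assigned to slow secondary storage, while
# recent items stay on the fast primary volume.
CUTOFF = timedelta(days=30)  # example threshold, not a Carbonio default

def target_tier(last_accessed: datetime, now: datetime) -> str:
    """Return which storage tier an item should live on."""
    return "secondary" if now - last_accessed > CUTOFF else "primary"

now = datetime(2024, 1, 31)
print(target_tier(datetime(2024, 1, 30), now))  # recent item -> primary
print(target_tier(datetime(2023, 11, 1), now))  # old item -> secondary
```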
The remainder of this section covers volumes and their management, HSM, policies, and other advanced techniques.
The Foundations of Carbonio Stores: Store Types and Their Functions
Carbonio supports two distinct kinds of store:
Index Store
a repository containing your data’s metadata, used by Apache Lucene for indexing and search.
Data Store
the location where all of your Carbonio data is stored, organised in a MySQL database.
You can have more than one store of each kind. However, only one Index Store, one Primary Data Store, and one Secondary Data Store may be designated as Current, meaning that these are the ones Carbonio is presently using.
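The rule above can be sketched as a small model (purely illustrative; none of these names are Carbonio code): many stores may exist per kind, but at most one per kind is marked as Current.

```python
# Illustrative model: several stores of each kind may be registered,
# but only one per kind is "current" (the one being written to).
class StoreRegistry:
    def __init__(self):
        self.stores = {}   # kind -> list of registered store names
        self.current = {}  # kind -> the single current store

    def add(self, kind: str, name: str) -> None:
        self.stores.setdefault(kind, []).append(name)

    def set_current(self, kind: str, name: str) -> None:
        if name not in self.stores.get(kind, []):
            raise ValueError(f"unknown {kind} store: {name}")
        self.current[kind] = name  # replaces any previous current store

reg = StoreRegistry()
reg.add("primary", "vol1")
reg.add("primary", "vol2")
reg.set_current("primary", "vol2")
print(reg.current["primary"])  # vol2
```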
Primary and Secondary Data Stores
Carbonio’s data stores come in two flavours: primary data stores and secondary data stores.
- Item deduplication is not available on centralised volumes.
- S3 buckets are the only storage type that can be used for centralised storage.
Configuring Centralised Storage
- Create an S3 bucket using the doCreateBucket command.
- S3: the bucket type
- BucketName: the bucket name, which must exactly match the name on the remote provider for the command to succeed
- X58Y54E5687R543: the remote username
- abCderT577eDfjhf: the remote password
- My_New_Bucket: the label assigned to the bucket
- https://example_bucket_provider.com: the endpoint Carbonio Storages uses to connect to the bucket
- Test the connection using the bucket ID (60b8139c-d56f-4012-a928-4b6182756301) returned in the previous step:
- Create a volume connected to the bucket on the first AppServer:
- S3: the bucket type
- Store_01: the name of the volume defined on the server where the command is run
- secondary: the volume type
- 60b8139c-d56f-4012-a928-4b6182756301: the bucket ID received in step 1
- volume_prefix: a label assigned to the volume, for example main_vol, used for quick searches
- True: the volume is centralised and can be accessed by several AppServers
- After the centralised volume has been created, it must be added to the volume list of every mailbox server by copying its configuration from the first server. To do so, run the following commands on every other AppServer:
- S3: the bucket type
- Store_01: the name of the volume defined on the server where the command is run
- mailbox_01.example.com: the servername of the server on which the volume was defined and created
Volumes in Carbonio
Volume Characteristics
- Name: the volume’s unique identifier
- Path: the location where data is to be saved. The zextras user must have read and write (r/w) permissions on this path.
- Compression: enables or disables file compression for the volume.
- Compression Threshold: the minimum file size at which compression is applied. Files smaller than this size will never be compressed, even if compression is enabled.
- Current: a current volume is one to which data is written as soon as it arrives (primary current) or when HSM policies are applied (secondary current).
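The Compression Threshold rule can be sketched as follows (illustrative only; the 4 KiB threshold is an example, not a Carbonio default):

```python
# Compression applies only when it is enabled AND the file is at least
# as large as the threshold; smaller files are never compressed.
THRESHOLD = 4096  # bytes, example value

def should_compress(size: int, compression_enabled: bool) -> bool:
    return compression_enabled and size >= THRESHOLD

print(should_compress(10_000, True))   # True
print(should_compress(1_000, True))    # False: below the threshold
print(should_compress(10_000, False))  # False: compression disabled
```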
Local Volumes
Local Volumes (i.e., of FileBlob type) can be hosted on any mountpoint on the system, regardless of where the mountpoint is located, and are defined by the following properties:
- Name: the volume’s unique identifier
- Path: the location where data is to be saved. The zextras user must have r/w permissions on this location.
- Compression: enables or disables file compression for the volume.
- Compression Threshold: the minimum file size at which compression is applied. Files smaller than this size will never be compressed, even if compression is enabled.
Current Volumes
Managing Volumes using Carbonio Storages
- FileBlob (local)
- Alibaba
- Ceph
- OpenIO
- Swift
- Cloudian (S3-compatible object storage)
- S3: Amazon and any other S3-compatible solution that is not explicitly supported
- Scality (S3-compatible object storage)
- EMC (S3-compatible object storage)
The Hierarchical Storage Management (HSM) Technique
- Fast storage costs more.
- Slow storage costs less.
- Old data is accessed significantly less frequently than new data.
Volumes, Stores, and Policies
- Primary Store: the fast-but-expensive storage where your data is initially deposited.
- Secondary Store: the slow-but-cheap storage to which older data is moved.
Moving Items Between Stores
- Every Blob of the items identified in the first phase is copied to the Secondary Store.
- The database records of the copied items are updated to reflect the move.
- The old Blobs are deleted from the Primary Store if, and only if, the previous phases completed successfully.
- Since the Move operation is stateful and each phase is carried out only if the previous one completed successfully, there is no risk of data loss.
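The phases above can be sketched in a few lines (hypothetical helper names, not Carbonio code): because each statement runs only if the previous one succeeded, the original blob is deleted only after both the copy and the database update have completed.

```python
# Sketch of the three-phase move: copy to secondary, update the record,
# and only then delete the original from the primary store. A failure in
# any phase aborts before the delete, so no data can be lost.
def move_blob(blob_id, primary, secondary, db):
    data = primary[blob_id]    # phase 1: read the blob...
    secondary[blob_id] = data  # ...and copy it to the secondary store
    db[blob_id] = "secondary"  # phase 2: update the database record
    del primary[blob_id]       # phase 3: delete the original blob

primary = {"b1": b"payload"}
secondary = {}
db = {"b1": "primary"}
move_blob("b1", primary, secondary, db)
print(db["b1"], "b1" in primary, "b1" in secondary)  # secondary False True
```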
The doMoveBlobs Operation of Carbonio Storages
- The Blob is copied to the current Secondary Store.
- The Carbonio database is updated to reflect the item’s new location.
- The original Blob is deleted from the Primary Store.
- Via the CLI
- Via scheduling
Policy Management: What Is a Policy?
Policy Examples
Establishing a Policy
Thanks to Carbonio Storages and S3 buckets, the majority of your data can be stored in secure and durable cloud storage.
S3-Compatible Services
- FileBlob (standard Local Volume)
- Amazon S3
- EMC
- OpenIO
- Swift
- Scality S3
- Cloudian
- Any custom, unsupported S3-compliant solution
The “Incoming” Directory and Primary Volumes
Local Cache
- an S3 bucket. To use the bucket, you must know its name and location.
- a user’s Access Key and Secret
- a policy granting that user full access to your bucket
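A policy granting a user full access to one bucket typically looks like the following AWS-style policy document; the bucket name "example-bucket" is a placeholder, and your provider's exact policy grammar may differ. Building it in Python just makes the structure explicit:

```python
import json

# Example AWS-style policy document granting full S3 access to a single
# bucket and its objects. "example-bucket" is a placeholder name.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:*",  # full access to the bucket
            "Resource": [
                "arn:aws:s3:::example-bucket",    # the bucket itself
                "arn:aws:s3:::example-bucket/*",  # all objects inside it
            ],
        }
    ],
}
print(json.dumps(policy, indent=2))
```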
Bucket Management
Amazon S3 Bucket Tips
User
Rights Management
What Is Item Deduplication?
Item Deduplication with Carbonio Storages
Running Volume Deduplication
- Current Pass (Digest Prefix): the doDeduplicate command analyses BLOBs in groups, based on the first character of their digest (name).
- Checked Mailboxes: the number of mailboxes checked during the current pass.
- Deduplicated/Duplicated Blobs: the number of BLOBs deduplicated by the current operation / the total number of duplicated items on the volume.
- Already Deduplicated Blobs: the number of duplicated blobs on the volume that had already been deduplicated by a previous run.
- Skipped Blobs: blobs that were not examined, typically because of a read error or a missing file.
- Invalid Digests: BLOBs with an incorrect digest (a name that differs from the file’s real digest).
- Total Space Saved: the amount of disk space saved by the doDeduplicate operation.
- The operation is now on the second-to-last pass, working on the last mailbox.
- 137089 duplicated BLOBs have been found, of which 71178 had already been deduplicated.
- 64868 BLOBs were deduplicated by the current operation, saving a total of 21.88GB of disk space.
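The digest-based idea behind these statistics can be sketched as follows (illustrative only; Carbonio's actual implementation and digest algorithm may differ): blobs are identified by a digest of their content, so identical contents collapse to a single stored copy.

```python
import hashlib

# Sketch of digest-based deduplication: each blob is keyed by the
# SHA-256 digest of its content. A repeated digest means duplicated
# content, which is counted instead of being stored again.
def deduplicate(blobs: list) -> tuple:
    store = {}       # digest -> one stored copy of the content
    duplicates = 0   # how many blobs were dropped as duplicates
    for data in blobs:
        digest = hashlib.sha256(data).hexdigest()
        if digest in store:
            duplicates += 1  # content already present: drop this copy
        else:
            store[digest] = data
    return store, duplicates

store, dups = deduplicate([b"a", b"b", b"a", b"a"])
print(len(store), dups)  # 2 2
```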