LDAP Based Registry Replication

Our current production Registry replication solution uses HornetQ to propagate replication messages to downstream clients. This solution has the following steps:

  1. When a change occurs in a manage bean, enqueue a message indicating the record that changed. The message priority can be set here if desired.
  2. These changes typically take place on apps nodes, in which case the message is sent via a HornetQ bridge to the msg machine.
  3. A Message Driven Bean on msg, RegistryBridgeBean, receives the messages from the queue, determines the type, builds a record, and calls RegistryReplBean.
  4. RegistryReplBean then performs XSLT transforms for each downstream client and enqueues that SPML record to the client's queue.
  5. A Client Replication MDB receives the SPML record, processes it as necessary, and updates the client system.

ED-LDAP, mail, Google, and AD replication all work this way.

Proposed Solution

HornetQ has historically worked well for us, but it has its problems, including strange locking behavior, changing configuration directives, buffering that can keep priority records from being replicated promptly, and poor logging. When developing ED-MW, the question was raised whether dealing with HornetQ going forward was worth the effort.

Our group has expertise in LDAP from both client and server side perspectives, and the replication technology that we use for our master and slave (producer and consumer) LDAP machines has proven to be reliable and fast. This technology, syncrepl, allows a client to perform a sync search against an LDAP directory. When records are changed in the directory, those changes are recorded in an accesslog, and the resulting entries are sent to the client that requested the sync search. The client can then determine whether it cares about the record that has changed and take action accordingly.
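The sync search idea can be illustrated with a toy in-memory model. This is only a sketch of the concept, not the syncrepl protocol: real sync cookies are opaque values managed by the server, and the class and method names here (ToyAccessLog, record_write, sync_search, lag) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAccessLog:
    """In-memory stand-in for an accesslog: an append-only list of changed DNs."""

    entries: list = field(default_factory=list)

    def record_write(self, dn):
        """Log one change, as the directory would after a successful write."""
        self.entries.append(dn)

    def sync_search(self, cookie=0):
        """Return (changes logged since `cookie`, new cookie).

        The cookie here is just a list position; a real cookie is opaque.
        """
        return self.entries[cookie:], len(self.entries)

    def lag(self, cookie):
        """How many logged changes a client holding `cookie` has not yet seen."""
        return len(self.entries) - cookie


log = ToyAccessLog()
log.record_write("uid=1,ou=People,dc=example,dc=edu")  # example DNs only
log.record_write("uid=2,ou=Groups,dc=example,dc=edu")

changes, cookie = log.sync_search()          # first search: everything so far
log.record_write("uid=3,ou=People,dc=example,dc=edu")
behind_by = log.lag(cookie)                  # client is one change behind
changes, cookie = log.sync_search(cookie)    # catch up from the saved cookie
```

Because the client keeps its own cookie, it can stop, restart, and resume from where it left off, and comparing the cookie with the log shows whether it is up-to-date or behind.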

The benefits of such a solution include better throughput (no XSLT transforms are needed), dealing with LDAP records directly instead of a text-based format (XML), improved logging, and a better ability to determine the state of a repl client, including whether it is up-to-date or behind.

New System Components

  1. VT Repl LDAP - Directory that clients monitor for replication changes.
  2. Registry “collector” table that contains list of changed records. (Simple message queue.)

New Replication Steps

  1. When a change occurs in a manage bean, add a row to a collector table in the Registry indicating the record that changed. The message channel can be set here if desired.
  2. A process on msg (analogous to RegistryBridgeBean) then selects a batch of rows from the table, ordered by channel, uid, and type. For each row, get the record and LDAP ADD it to the VT Repl LDAP. If there are redundant uids/types, only the highest channel will be processed; the others will be skipped. The rows in the batch are deleted in this step.
  3. If an LDAP write occurs in step 2, that change will be logged in the cn=accesslog as an auditWriteObject.
  4. Each downstream client has a repl client that is simply a syncrepl client against the VT Repl LDAP cn=accesslog. When a write occurs in (3), the syncrepl search will receive the auditWriteObject for the change.
  5. When auditWriteObjects are received, the client will determine if it is a record it needs to handle. If so, it will look up the entry from the VT Repl LDAP and process it for its downstream client.
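The duplicate handling in step 2 can be sketched as follows. This is a minimal illustration, assuming that a larger channel number is the more urgent one and that the batch is processed highest channel first; the source does not state either. The Row type and plan_batch function are hypothetical names.

```python
from collections import namedtuple

# One collector-table row: which record changed, and on which channel.
Row = namedtuple("Row", "uid type channel")


def plan_batch(rows):
    """Return the rows to replicate: one per (uid, type), keeping the highest channel.

    Assumption: a larger channel number wins when the same uid/type repeats.
    """
    best = {}
    for row in rows:
        key = (row.uid, row.type)
        if key not in best or row.channel > best[key].channel:
            best[key] = row
    # Highest channel first, then uid and type for a stable order.
    return sorted(best.values(), key=lambda r: (-r.channel, r.uid, r.type))


batch = [
    Row(100, "person", 1),
    Row(100, "person", 2),  # same uid/type on a higher channel
    Row(200, "group", 1),
]
to_replicate = plan_batch(batch)
```

Here the channel 1 row for uid 100 is skipped, since the channel 2 row for the same uid and type already covers it.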

Handling Priority

JMS has the concept of priority for messages, which allows higher priority messages to be delivered before lower priority messages.

LDAP syncrepl has no concept of priority; messages are sent in the order they are written. Though we expect to process more records per second than with HornetQ, which should shorten wait times for updates, we would still like some changes to make it through faster than others. These are typically account-related changes such as passwords.

Instead of message priority, we are going to use “channels”. The idea is that most changes go through the standard channel, at what we previously called normal priority. Account changes, or other changes that we need to replicate quickly, go through another channel named “accounts” or similar.

To implement the idea of channels, each client repl will perform a syncrepl search for the configured channels. We will start with 2 channels, but this can be extended to any number of channels.

Note that records in the accounts channel will only be processed faster if more records are generally sent through the standard channel than through the accounts channel.
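A small calculation shows why a separate channel only helps when the standard channel carries most of the traffic. This is a sketch under stated assumptions: the channel numbers 1 and 2 and the function records_ahead are hypothetical, and each search is assumed to see matching records in the order they were written.

```python
STANDARD, ACCOUNTS = 1, 2  # assumed channel numbers, not from the source


def records_ahead(log, channels):
    """Count earlier records that a search limited to `channels` must handle
    before it reaches the newest record in `log`."""
    return sum(1 for channel in log[:-1] if channel in channels)


# 1000 standard changes are already logged when a password change arrives.
busy_log = [STANDARD] * 1000 + [ACCOUNTS]
ahead_all_channels = records_ahead(busy_log, {STANDARD, ACCOUNTS})
ahead_accounts_only = records_ahead(busy_log, {ACCOUNTS})

# If the log holds only account changes, a separate channel gains nothing.
quiet_log = [ACCOUNTS] * 5
ahead_quiet = records_ahead(quiet_log, {ACCOUNTS})
```

In the busy case the accounts-only search has nothing ahead of the password change, while a search over all channels has 1000 records to work through first. In the quiet case both searches see the same backlog.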

Repl Client Details

Each repl client needs the following:

  1. A filter for the syncrepl search, most likely in the form:
    (&(objectClass=auditWriteObject)(reqMod=channel:= 2))  # specify channel
    (&(objectClass=auditWriteObject))  # all channels
  2. An entry in the VT Repl LDAP to store the sync cookie. This allows a repl client to keep track of its state.
  3. A list of the OUs that the client cares about (e.g. People, Services, Groups, Entitlements, Mail, Addresses, HealthChecks).
  4. The objectClasses that correspond to these OUs. This will be discussed further in the LDAP schema section. With the objectClass we can request attributes like @virginiaTechPerson to get only those attributes the client cares about.
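The filter and OU checks above could be assembled along these lines. This is a hedged sketch: the helper names (channel_filter, ou_of, cares_about) are hypothetical, the multi-channel OR form is an assumption about how several channels would be combined, and the DN parsing is deliberately naive (it assumes no escaped commas in the DN).

```python
def channel_filter(channels=None):
    """Build the syncrepl search filter for the given channels (None = all)."""
    base = "(objectClass=auditWriteObject)"
    if not channels:
        return f"(&{base})"
    terms = "".join(f"(reqMod=channel:= {c})" for c in channels)
    if len(channels) > 1:
        terms = f"(|{terms})"  # assumed way to match any of several channels
    return f"(&{base}{terms})"


def ou_of(dn):
    """Return the first ou= value in a DN, or None if there is none."""
    for rdn in dn.split(","):
        name, _, value = rdn.strip().partition("=")
        if name.lower() == "ou":
            return value
    return None


def cares_about(req_dn, ous):
    """True if the changed entry sits under one of the client's OUs."""
    ou = ou_of(req_dn)
    return ou is not None and ou.lower() in {o.lower() for o in ous}


client_ous = ["People", "Groups"]
accounts_filter = channel_filter([2])
handle = cares_about("uid=123,ou=People,dc=example,dc=edu", client_ous)
skip = cares_about("uid=9,ou=Mail,dc=example,dc=edu", client_ous)
```

A client would run the search with its filter and then apply cares_about to the DN of each auditWriteObject it receives before looking up the entry in the VT Repl LDAP.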

Locking will be used to ensure that certain channels can take precedence when processing records.

Schema

Database

CREATE TABLE "VTREGISTRY"."VTREPLICATION" (
  "UID" NUMBER(19,0) NOT NULL ENABLE,
  "TYPE" VARCHAR2(20) NOT NULL ENABLE,
  "CHANNEL" NUMBER(2,0) NOT NULL ENABLE,
  "CREATED_DATE" TIMESTAMP NOT NULL ENABLE
);
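How the collector table acts as a simple queue can be tried out with an in-memory SQLite database. This is a sketch only: SQLite column types stand in for the Oracle ones, the function names are hypothetical, and ordering by channel descending assumes that a larger channel number is more urgent.

```python
import sqlite3


def open_collector():
    """Create an in-memory stand-in for the VTREPLICATION table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vtreplication ("
        " uid INTEGER NOT NULL,"
        " type TEXT NOT NULL,"
        " channel INTEGER NOT NULL,"
        " created_date TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def enqueue(conn, uid, type_, channel):
    """Record that a Registry record changed (step 1 of the new flow)."""
    conn.execute(
        "INSERT INTO vtreplication (uid, type, channel) VALUES (?, ?, ?)",
        (uid, type_, channel),
    )


def take_batch(conn, limit):
    """Select up to `limit` rows in processing order, then delete exactly those rows."""
    rows = conn.execute(
        "SELECT rowid, uid, type, channel FROM vtreplication"
        " ORDER BY channel DESC, uid, type LIMIT ?",
        (limit,),
    ).fetchall()
    conn.executemany(
        "DELETE FROM vtreplication WHERE rowid = ?", [(r[0],) for r in rows]
    )
    conn.commit()
    return [r[1:] for r in rows]


conn = open_collector()
enqueue(conn, 100, "person", 1)
enqueue(conn, 200, "group", 2)
enqueue(conn, 300, "person", 1)
first_batch = take_batch(conn, 2)
remaining = conn.execute("SELECT COUNT(*) FROM vtreplication").fetchone()[0]
```

Deleting by rowid removes only the rows that were actually selected, so changes enqueued while a batch is being processed stay in the table for the next pass.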

LDAP

vtReplPerson, vtReplGroup, vtReplService, vtReplEntitlement, vtReplAddress, vtReplMail, vtReplHealthCheck

Corresponding objectClasses for clients to request specific attributes.

TODO: add schema here

LDAP Details

Performance

middleware/devel/ed/replication.txt · Last modified: 2015/06/01 12:02 (external edit)