SQL Server 2012 SP2 Replication Log Reader error v.11.00.5058

After moving Dynamics GP to a newly built server using a backup-and-restore strategy, the Dynamics GP SQL replication kept causing problems. The following message flooded the server’s error logs. I ended up rebuilding the replication configuration entirely to solve the issue, but there may be some clues in the steps I went through that might help someone else out there. There was a lot more digging and prodding than the summary below shows!

The last step did not log any message! (Source: MSSQL_REPL, Error number: MSSQL_REPL22037)

It was possible to add text logging by going to the “Run Agent” step in the Log Reader SQL job and adding an -Output file to the list of parameters, like so:

-Publisher [xxxxx] -PublisherDB [xxxxx] -Distributor [xxxxx] -DistributorSecurityMode 1 -Continuous -Output f:\SQLBackups\Reports\ReplicationOutputLog.txt

The ReplicationOutputLog.txt then contained the following:

Validating publisher

DBMS: Microsoft SQL Server
Version: 11.00.5058
catalog name:
user name: dbo
API conformance: 0
SQL conformance: 0
transaction capable: 1
read only: F
identifier quote char: "
non_nullable_columns: 0
owner usage: 15
max table name len: 128
max column name len: 128
need long data len:
max columns in table: 1000
max columns in index: 16
max char literal len: 131072
max statement len: 131072
max row size: 131072
2015-02-06 08:54:59.278 OLE DB xxxxxx 'xxxxxxx': {call sp_MSensure_single_instance(N'Repl-LogReader-0-xxxx-19', 2)}
2015-02-06 08:54:59.278 Parameter values obtained from agent profile:
-pollinginterval 5000
-historyverboselevel 1
-logintimeout 15
-querytimeout 1800
-readbatchsize 500
-readbatchsize 500000
2015-02-06 08:54:59.278 Status: 0, code: 22037, text: 'The last step did not log any message!'.
2015-02-06 08:54:59.278 The last step did not log any message!
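Once the agent is writing to a text file, it can help to pull out just the status lines and the profile parameters rather than reading the whole thing by eye. The following is a minimal Python sketch that assumes the log keeps the line shapes shown above; the function name `summarize_agent_log` and the returned dictionary keys are my own invention, not part of any SQL Server tooling. It also flags any parameter name that appears more than once, as `-readbatchsize` does in the output above.

```python
import re
from collections import Counter

# Timestamped status lines, e.g.
# 2015-02-06 08:54:59.278 Status: 0, code: 22037, text: '...'.
STATUS_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"Status: (?P<status>-?\d+), code: (?P<code>\d+), "
    r"text: '(?P<text>.*)'\.?\s*$"
)

# Agent profile parameter lines, e.g. "-querytimeout 1800"
PARAM_RE = re.compile(r"^-(?P<name>[A-Za-z]+)\s+(?P<value>\S+)\s*$")


def summarize_agent_log(lines):
    """Collect status lines and profile parameters from agent output.

    `lines` is any iterable of strings (for example an open file).
    The dictionary layout is a hypothetical convenience format.
    """
    statuses = []  # (timestamp, error code, message text)
    params = []    # (lower-cased parameter name, value as text)
    for raw in lines:
        line = raw.strip()
        match = STATUS_RE.match(line)
        if match:
            statuses.append(
                (match.group("ts"), int(match.group("code")), match.group("text"))
            )
            continue
        match = PARAM_RE.match(line)
        if match:
            params.append((match.group("name").lower(), match.group("value")))

    # A name listed twice in the profile dump is worth a second look.
    counts = Counter(name for name, _ in params)
    repeated = sorted(name for name, n in counts.items() if n > 1)
    return {"statuses": statuses, "params": params, "repeated_params": repeated}


# Usage (path taken from the -Output parameter above):
# with open(r"f:\SQLBackups\Reports\ReplicationOutputLog.txt") as handle:
#     summary = summarize_agent_log(handle)
```

This only summarises what the agent wrote; it does not explain why error 22037 carries no further detail.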

We were also getting lots of errors along the lines of “Only one Log Reader Agent or log-related procedure … can connect to a database at a time” (see the screenshot below).

image

A search turned up some posts about SP1/SP2 hotfixes and cumulative updates for SQL Server 2012 that fix multiple-log-reader errors like this one. Other posts talked about the database not being upgraded correctly, or about schema differences between versions. My conclusion from reading these posts was that the replication distribution database (distribution) may have been brought over from the previous server (we don’t know the full history of whether it was restored, or replication was rebuilt from scripts, or both). SP1 and SP2 were both applied before our data was restored, so any upgrade operations they include would never have been run against the restored database.

After a few evenings spent removing everything of relevance I could find in the replication distribution database tables, and clearing out and recreating the log reader agent and snapshot agent jobs, the problems still persisted. After countless recreations of the publications and subscriptions, it felt like there were remnants of a previous version’s replication clinging on deep in the bowels of replication (amazing how gut feeling develops with experience).

Running out of patience and lacking a silver bullet, the solution was to remove the subscriptions and the publications, then disable replication on the server. I then made sure the replication distribution database and its data files were gone. I also made sure no SQL jobs relating to replication were left behind (there were some, so I removed them too). Finally, I checked all the nodes in SSMS that relate to replication to make sure nothing was left at all, including on the subscriber (there was a shadow of the subscription left on the subscriber).

I then built replication anew: created a new distribution database (with a new name, for peace of mind), created a new publication, added the articles, and created new subscriptions.

After reinitializing all subscriptions and getting snapshots, everything started working how it should.

I feel lucky that I don’t have a totally mission critical replication setup. It is good that we have two copies of our replicated tables in different schemas, with views fronting them. This allows us to toggle between live and cached data modes, while “messing on with replication”.

Hey, it’s working!

The only thing left to figure out is if the “one log reader” error is something to worry about, or perhaps it will simply go away with the next service pack, whenever that is…