How to purge Zookeeper Transaction Log and Snapshots

Goal:

How to purge Zookeeper Transaction Log and Snapshots.

Env:

Zookeeper 3.4.5 on MapR 5.1

Solution:

As the ZooKeeper Administrator's Guide mentions:
A ZooKeeper server will not remove old snapshots and log files when using the default configuration.

The Hadoop administrator is therefore responsible for pruning these files.
There are two ways to do it: manually or automatically.

1. Manual Way

Schedule a cron job to run the command below on each machine where ZooKeeper is running.
For example, to keep only the 3 most recent snapshots and their transaction logs (3 is the minimum):
java -cp /opt/mapr/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5-mapr-1503.jar:/opt/mapr/zookeeper/zookeeper-3.4.5/lib/slf4j-api-1.6.1.jar:/opt/mapr/zookeeper/zookeeper-3.4.5/lib/slf4j-log4j12-1.6.1.jar:/opt/mapr/zookeeper/zookeeper-3.4.5/lib/log4j-1.2.15.jar org.apache.zookeeper.server.PurgeTxnLog /opt/mapr/zkdata /opt/mapr/zkdata -n 3
Here is the output before and after running this command manually. Note that a retain count of 2 is rejected, while 3 is accepted:
[root@v7 version-2]# ls -altr
total 10060
-rw-r--r-- 1 mapr mapr   136941 Mar 10  2016 snapshot.2000001aa
-rw-r--r-- 1 mapr mapr      296 Mar 10  2016 snapshot.0
-rw-r--r-- 1 mapr mapr 67108880 Mar 10  2016 log.100000001
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 17:32 log.2000001ab
-rw-r--r-- 1 mapr mapr   204060 Jan 11 18:36 snapshot.200001e6b
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 18:37 log.300000001
drwxr-x--- 3 mapr mapr     4096 Jan 11 18:38 ..
-rw-r--r-- 1 mapr mapr   204060 Jan 11 18:38 snapshot.300000012
-rw-r--r-- 1 mapr mapr        1 Jan 11 18:38 acceptedEpoch
-rw-r--r-- 1 mapr mapr        1 Jan 11 18:38 currentEpoch
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 18:38 log.400000001
-rw-r--r-- 1 mapr mapr   204060 Jan 11 18:38 snapshot.40000000b
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 18:40 log.40000000d
-rw-r--r-- 1 mapr mapr   204391 Jan 11 18:40 snapshot.40000001d
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 18:40 log.40000001f
-rw-r--r-- 1 mapr mapr   204391 Jan 11 18:40 snapshot.40000002e
drwxr-xr-x 2 mapr mapr     4096 Jan 11 18:40 .
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 18:40 log.400000030
[root@v7 version-2]# java -cp /opt/mapr/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5-mapr-1503.jar:/opt/mapr/zookeeper/zookeeper-3.4.5/lib/slf4j-api-1.6.1.jar:/opt/mapr/zookeeper/zookeeper-3.4.5/lib/slf4j-log4j12-1.6.1.jar:/opt/mapr/zookeeper/zookeeper-3.4.5/lib/log4j-1.2.15.jar org.apache.zookeeper.server.PurgeTxnLog /opt/mapr/zkdata /opt/mapr/zkdata -n 2
Exception in thread "main" java.lang.IllegalArgumentException: count should be greater than 3
 at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:65)
 at org.apache.zookeeper.server.PurgeTxnLog.main(PurgeTxnLog.java:131)
[root@v7 version-2]# java -cp /opt/mapr/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5-mapr-1503.jar:/opt/mapr/zookeeper/zookeeper-3.4.5/lib/slf4j-api-1.6.1.jar:/opt/mapr/zookeeper/zookeeper-3.4.5/lib/slf4j-log4j12-1.6.1.jar:/opt/mapr/zookeeper/zookeeper-3.4.5/lib/log4j-1.2.15.jar org.apache.zookeeper.server.PurgeTxnLog /opt/mapr/zkdata /opt/mapr/zkdata -n 3
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.server.persistence.FileTxnSnapLog).
log4j:WARN Please initialize the log4j system properly.
Removing file: Jan 11, 2017 6:37:14 PM /opt/mapr/zkdata/version-2/log.300000001
Removing file: Jan 11, 2017 5:32:36 PM /opt/mapr/zkdata/version-2/log.2000001ab
Removing file: Mar 10, 2016 5:44:09 PM /opt/mapr/zkdata/version-2/log.100000001
Removing file: Mar 10, 2016 3:31:37 PM /opt/mapr/zkdata/version-2/snapshot.0
Removing file: Jan 11, 2017 6:36:51 PM /opt/mapr/zkdata/version-2/snapshot.200001e6b
Removing file: Jan 11, 2017 6:38:05 PM /opt/mapr/zkdata/version-2/snapshot.300000012
Removing file: Mar 10, 2016 10:27:48 AM /opt/mapr/zkdata/version-2/snapshot.2000001aa
[root@v7 version-2]# ls -altr
total 648
drwxr-x--- 3 mapr mapr     4096 Jan 11 18:38 ..
-rw-r--r-- 1 mapr mapr        1 Jan 11 18:38 acceptedEpoch
-rw-r--r-- 1 mapr mapr        1 Jan 11 18:38 currentEpoch
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 18:38 log.400000001
-rw-r--r-- 1 mapr mapr   204060 Jan 11 18:38 snapshot.40000000b
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 18:40 log.40000000d
-rw-r--r-- 1 mapr mapr   204391 Jan 11 18:40 snapshot.40000001d
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 18:40 log.40000001f
-rw-r--r-- 1 mapr mapr   204391 Jan 11 18:40 snapshot.40000002e
-rw-r--r-- 1 mapr mapr 67108880 Jan 11 18:40 log.400000030
drwxr-xr-x 2 mapr mapr     4096 Jan 11 18:49 .
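
The cron job mentioned above could wrap the purge command in a small script. The following is only a minimal sketch, assuming the same jar names and paths as in the command shown earlier. The `--run` flag, the function names, and the `ZK_HOME`, `ZK_DATA` and `RETAIN` variables are hypothetical names introduced for illustration; they are not part of ZooKeeper or MapR.

```shell
#!/bin/sh
# Sketch of a cron wrapper around PurgeTxnLog.
# Defaults are the paths used in this post; adjust them for your install.
ZK_HOME="${ZK_HOME:-/opt/mapr/zookeeper/zookeeper-3.4.5}"
ZK_DATA="${ZK_DATA:-/opt/mapr/zkdata}"
RETAIN="${RETAIN:-3}"

# Build the classpath from the same four jars used in the manual command.
purge_classpath() {
    printf '%s' "$ZK_HOME/zookeeper-3.4.5-mapr-1503.jar"
    for jar in slf4j-api-1.6.1.jar slf4j-log4j12-1.6.1.jar log4j-1.2.15.jar; do
        printf ':%s' "$ZK_HOME/lib/$jar"
    done
}

# Succeed only for an integer retain count of at least 3,
# since PurgeTxnLog rejects smaller values (see the -n 2 run above).
check_retain() {
    [ "$1" -ge 3 ] 2>/dev/null
}

# Run the purge with the data dir used as both log dir and snapshot dir.
run_purge() {
    check_retain "$RETAIN" || { echo "RETAIN must be an integer >= 3" >&2; return 1; }
    java -cp "$(purge_classpath)" org.apache.zookeeper.server.PurgeTxnLog \
        "$ZK_DATA" "$ZK_DATA" -n "$RETAIN"
}

# Purge only when explicitly asked, so loading the file has no side effects.
if [ "${1:-}" = "--run" ]; then
    run_purge
fi
```

A crontab entry such as `0 3 * * * /path/to/zk-purge.sh --run` (the path and schedule are placeholders) would then purge once a day. The job needs to run as a user with write access to the data directory.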

2. Auto Way

Set the two parameters below in zoo.cfg and restart ZooKeeper.
autopurge.snapRetainCount
   New in 3.4.0: When enabled, ZooKeeper auto purge feature retains the autopurge.snapRetainCount most recent snapshots and the corresponding transaction logs in the dataDir and dataLogDir respectively and deletes the rest. Defaults to 3. Minimum value is 3.

autopurge.purgeInterval
    New in 3.4.0: The time interval in hours for which the purge task has to be triggered. Set to a positive integer (1 and above) to enable the auto purging. Defaults to 0.
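
For illustration, the two settings might appear in zoo.cfg as follows. The values are example choices, not recommendations from the guide: they keep the 3 most recent snapshots and run the purge task every 24 hours.

```
# Keep the 3 most recent snapshots and their transaction logs (3 is the minimum)
autopurge.snapRetainCount=3
# Run the purge task every 24 hours (0, the default, disables auto purging)
autopurge.purgeInterval=24
```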
