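Nacos Service Registration: A Source Code Analysis
In this article we analyze, at the source-code level, how service registration works when Nacos acts as the service registry. I cover both the server side and the client side; for the client I mainly use spring-cloud-alibaba as the core client component. On the server side I explain how Nacos lets the AP and CP modes coexist and how it tells them apart. At the end I share some experience on locating the core classes while debugging the source.
First, a brief description of my environment: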
- JDK 1.8
- nacos-server-1.4.2
- spring-boot-2.3.5.RELEASE
- spring-cloud-Hoxton.SR8
- spring-cloud-alibaba-2.2.5.RELEASE
Nacos Service Architecture
With Spring Boot as the base platform for our services, Nacos sits in the service architecture as shown in the figure below:

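Broadly speaking, middleware with functionality similar to Nacos includes Eureka, ZooKeeper, Consul, and etcd. The most distinctive feature of Nacos is that it supports both AP and CP modes; on the consistency side it relies on the Raft protocol.
Nacos Client
Service Registration: Client Side
Adding the Dependency
Nacos service registration is initiated by the client, which hooks into Spring's startup-completed event to call the registration method. First we need to import the spring-cloud-starter-alibaba-nacos-discovery dependency: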
- <dependency>
- <groupId>com.alibaba.cloud</groupId>
- <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
- </dependency>
Analyzing the Source
For a Spring Boot component, the first thing to look for is its META-INF/spring.factories file:
- org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
- com.alibaba.cloud.nacos.discovery.NacosDiscoveryAutoConfiguration,\
- com.alibaba.cloud.nacos.ribbon.RibbonNacosAutoConfiguration,\
- com.alibaba.cloud.nacos.endpoint.NacosDiscoveryEndpointAutoConfiguration,\
- com.alibaba.cloud.nacos.registry.NacosServiceRegistryAutoConfiguration,\
- com.alibaba.cloud.nacos.discovery.NacosDiscoveryClientConfiguration,\
- com.alibaba.cloud.nacos.discovery.reactive.NacosReactiveDiscoveryClientConfiguration,\
- com.alibaba.cloud.nacos.discovery.configclient.NacosConfigServerAutoConfiguration,\
- com.alibaba.cloud.nacos.NacosServiceAutoConfiguration
- org.springframework.cloud.bootstrap.BootstrapConfiguration=\
- com.alibaba.cloud.nacos.discovery.configclient.NacosDiscoveryClientConfigServiceBootstrapConfiguration
From my analysis, NacosServiceRegistryAutoConfiguration is the core configuration class for service registration. It defines three core beans:
- NacosServiceRegistry
- NacosRegistration
- NacosAutoServiceRegistration
NacosAutoServiceRegistration
NacosAutoServiceRegistration is what makes the service register itself with Nacos. It extends the abstract class AbstractAutoServiceRegistration.
AbstractAutoServiceRegistration implements ApplicationContextAware and ApplicationListener, listening for WebServerInitializedEvent:
- public void onApplicationEvent(WebServerInitializedEvent event) {
- bind(event);
- }
The listener then calls bind(event):
- public void bind(WebServerInitializedEvent event) {
- ApplicationContext context = event.getApplicationContext();
- if (context instanceof ConfigurableWebServerApplicationContext) {
- if ("management".equals(((ConfigurableWebServerApplicationContext) context)
- .getServerNamespace())) {
- return;
- }
- }
- this.port.compareAndSet(0, event.getWebServer().getPort());
- this.start();
- }
bind in turn calls start():
- public void start() {
- if (!isEnabled()) {
- if (logger.isDebugEnabled()) {
- logger.debug("Discovery Lifecycle disabled. Not starting");
- }
- return;
- }
- // only initialize if nonSecurePort is greater than 0 and it isn't already running
- // because of containerPortInitializer below
- if (!this.running.get()) {
- this.context.publishEvent(
- new InstancePreRegisteredEvent(this, getRegistration()));
- register();
- if (shouldRegisterManagement()) {
- registerManagement();
- }
- this.context.publishEvent(
- new InstanceRegisteredEvent<>(this, getConfiguration()));
- this.running.compareAndSet(false, true);
- }
- }
Finally, register() is called, which internally calls serviceRegistry.register() to complete the registration.
- private final ServiceRegistry<R> serviceRegistry;
- protected void register() {
- this.serviceRegistry.register(getRegistration());
- }
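The chain above (event, then bind, then start, then register) can be sketched without Spring. This is a simplified stand-in under stated assumptions, not the Spring Cloud class: `WebServerReady`, `onEvent`, and the `registered` list are hypothetical names, and the sketch flips the running flag with a single compareAndSet instead of the get-then-set sequence of the original.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified stand-in for AbstractAutoServiceRegistration; not the real class. */
class AutoRegistrationSketch {

    /** Hypothetical stand-in for WebServerInitializedEvent. */
    static final class WebServerReady {
        final String serverNamespace; // null for the main web server
        final int port;

        WebServerReady(String serverNamespace, int port) {
            this.serverNamespace = serverNamespace;
            this.port = port;
        }
    }

    private final AtomicInteger port = new AtomicInteger(0);
    private final AtomicBoolean running = new AtomicBoolean(false);
    /** Records what would have been sent to the registry. */
    final List<String> registered = new ArrayList<>();

    /** Mirrors onApplicationEvent -> bind: skip the management server, remember the first port. */
    void onEvent(WebServerReady event) {
        if ("management".equals(event.serverNamespace)) {
            return;
        }
        port.compareAndSet(0, event.port); // only the first port wins
        start();
    }

    /** Mirrors start(): register once, then mark the lifecycle as running. */
    private void start() {
        if (running.compareAndSet(false, true)) {
            registered.add("127.0.0.1:" + port.get());
        }
    }

    public static void main(String[] args) {
        AutoRegistrationSketch registration = new AutoRegistrationSketch();
        registration.onEvent(new WebServerReady("management", 8081)); // ignored
        registration.onEvent(new WebServerReady(null, 8080));         // registers
        registration.onEvent(new WebServerReady(null, 9090));         // already running
        System.out.println(registration.registered);
    }
}
```

The two atomics are what make the hook idempotent: a second web-server event neither changes the port nor triggers a second registration.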
NacosServiceRegistry
The main purpose of the NacosServiceRegistry class is to perform the service registration:
- public void register(Registration registration) {
- if (StringUtils.isEmpty(registration.getServiceId())) {
- log.warn("No service to register for nacos client...");
- return;
- }
- // By default this returns, via reflection, an instance of `com.alibaba.nacos.client.naming.NacosNamingService`
- NamingService namingService = namingService();
- // Get the serviceId; defaults to the value of spring.application.name
- String serviceId = registration.getServiceId();
- // Get the group; defaults to DEFAULT_GROUP
- String group = nacosDiscoveryProperties.getGroup();
- // Build the Instance
- Instance instance = getNacosInstanceFromRegistration(registration);
- try {
- // Register the instance
- namingService.registerInstance(serviceId, group, instance);
- log.info("nacos registry, {} {} {}:{} register finished", group, serviceId,
- instance.getIp(), instance.getPort());
- }
- catch (Exception e) {
- log.error("nacos registry, {} register failed...{},", serviceId,
- registration.toString(), e);
- // rethrow a RuntimeException if the registration is failed.
- // issue : https://github.com/alibaba/spring-cloud-alibaba/issues/1132
- rethrowRuntimeException(e);
- }
- }
As we can see, the method that is finally called is namingService.registerInstance(serviceId, group, instance):
- public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
- NamingUtils.checkInstanceIsLegal(instance);
- String groupedServiceName = NamingUtils.getGroupedName(serviceName, groupName);
- if (instance.isEphemeral()) {
- BeatInfo beatInfo = beatReactor.buildBeatInfo(groupedServiceName, instance);
- beatReactor.addBeatInfo(groupedServiceName, beatInfo);
- }
- serverProxy.registerService(groupedServiceName, groupName, instance);
- }
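For ephemeral instances the method above hands a BeatInfo to the beat reactor before registering. The sketch below illustrates the general technique, a self-rescheduling heartbeat task, and is not the Nacos BeatReactor itself. `runBeats` and the printed message are hypothetical; the `group@@service` format reflects my understanding of what NamingUtils.getGroupedName produces, and the period is shortened so the demo finishes quickly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class HeartbeatSketch {

    /** Assumed naming convention: group and service joined by "@@". */
    static String groupedName(String serviceName, String groupName) {
        return groupName + "@@" + serviceName;
    }

    /**
     * Sends `beats` heartbeats, one every periodMillis, by letting the task
     * reschedule itself. Returns true if all beats were sent in time.
     */
    static boolean runBeats(final String groupedServiceName, final long periodMillis, int beats) {
        final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        final CountDownLatch remaining = new CountDownLatch(beats);
        Runnable beatTask = new Runnable() {
            @Override
            public void run() {
                // A real client would send an HTTP beat request to the server here.
                System.out.println("beat -> " + groupedServiceName);
                remaining.countDown();
                if (remaining.getCount() > 0) {
                    executor.schedule(this, periodMillis, TimeUnit.MILLISECONDS);
                }
            }
        };
        executor.schedule(beatTask, periodMillis, TimeUnit.MILLISECONDS);
        try {
            return remaining.await(periodMillis * beats + 2000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String name = groupedName("demo-service", "DEFAULT_GROUP");
        System.out.println("all beats sent: " + runBeats(name, 50, 3));
    }
}
```

Rescheduling from inside the task, rather than using a fixed-rate schedule, lets the period change between beats, for example when the server suggests a different interval.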
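For ephemeral instances, registerInstance first passes a BeatInfo to beatReactor.addBeatInfo(), which creates a schedule that sends a heartbeat to the server every 5 s. It then calls serverProxy.registerService(groupedServiceName, groupName, instance) to perform the registration: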
- public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {
- NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}", namespaceId, serviceName,
- instance);
- final Map<String, String> params = new HashMap<String, String>(16);
- params.put(CommonParams.NAMESPACE_ID, namespaceId);
- params.put(CommonParams.SERVICE_NAME, serviceName);
- params.put(CommonParams.GROUP_NAME, groupName);
- params.put(CommonParams.CLUSTER_NAME, instance.getClusterName());
- params.put("ip", instance.getIp());
- params.put("port", String.valueOf(instance.getPort()));
- params.put("weight", String.valueOf(instance.getWeight()));
- params.put("enable", String.valueOf(instance.isEnabled()));
- params.put("healthy", String.valueOf(instance.isHealthy()));
- params.put("ephemeral", String.valueOf(instance.isEphemeral()));
- params.put("metadata", JacksonUtils.toJson(instance.getMetadata()));
- // POST: /nacos/v1/ns/instance performs the service registration
- reqApi(UtilAndComs.nacosUrlInstance, params, HttpMethod.POST);
- }
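To make the request concrete, here is a minimal sketch that assembles a subset of those parameters and form-encodes them. It only builds the string; it does not contact a server. The parameter names are assumed to match the CommonParams constants, the sample values are invented, and whether the real client sends them as a query string or as a body is not something this sketch claims.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class RegisterRequestSketch {

    /** Collects a subset of the registration parameters, in insertion order. */
    static Map<String, String> params(String namespaceId, String serviceName, String groupName,
                                      String ip, int port, boolean ephemeral) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("namespaceId", namespaceId); // assumed value of CommonParams.NAMESPACE_ID
        params.put("serviceName", serviceName); // assumed value of CommonParams.SERVICE_NAME
        params.put("groupName", groupName);     // assumed value of CommonParams.GROUP_NAME
        params.put("ip", ip);
        params.put("port", String.valueOf(port));
        params.put("ephemeral", String.valueOf(ephemeral));
        return params;
    }

    /** Encodes the parameters as key=value pairs joined by '&'. */
    static String formEncode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(entry.getKey())).append('=').append(encode(entry.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = params("public", "DEFAULT_GROUP@@demo-service",
                "DEFAULT_GROUP", "192.168.1.10", 8080, true);
        System.out.println("POST /nacos/v1/ns/instance");
        System.out.println(formEncode(params));
    }
}
```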
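Service Registration: Server Side
As a service registry, Nacos can operate under either an AP or a CP architecture to maintain the list of services in the registry. Below is a simplified diagram of the data model behind the service list: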

This mirrors how the client builds the Instance in NacosServiceRegistry#register. Returning to the source, we go straight to the server-side /nacos/v1/ns/instance endpoint, which is defined in InstanceController#register.
Service Registration
InstanceController#register mainly parses the request parameters and then calls serviceManager.registerInstance; a return value of ok means the registration succeeded.
- @CanDistro
- @PostMapping
- @Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
- public String register(HttpServletRequest request) throws Exception {
- final String namespaceId = WebUtils
- .optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
- final String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
- NamingUtils.checkServiceNameFormat(serviceName);
- final Instance instance = parseInstance(request);
- serviceManager.registerInstance(namespaceId, serviceName, instance);
- return "ok";
- }
The call into registerInstance:
- public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
- createEmptyService(namespaceId, serviceName, instance.isEphemeral());
- Service service = getService(namespaceId, serviceName);
- if (service == null) {
- throw new NacosException(NacosException.INVALID_PARAM,
- "service not found, namespace: " + namespaceId + ", service: " + serviceName);
- }
- addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
- }
It then calls addInstance():
- @Resource(name = "consistencyDelegate")
- private ConsistencyService consistencyService;
- public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips)
- throws NacosException {
- String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
- Service service = getService(namespaceId, serviceName);
- synchronized (service) {
- List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
- Instances instances = new Instances();
- instances.setInstanceList(instanceList);
- consistencyService.put(key, instances);
- }
- }
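The create-if-absent step followed by a synchronized update can be sketched with a nested concurrent map. This is a simplified model under assumptions: instances are plain address strings, `ServiceRegistrySketch` and `instancesOf` are hypothetical names, and the consistency layer that the real addInstance writes through is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ServiceRegistrySketch {

    // namespace -> (service name -> instance addresses)
    private final Map<String, Map<String, List<String>>> serviceMap = new ConcurrentHashMap<>();

    /** Creates the namespace and service entries on first use, then adds the instance. */
    void registerInstance(String namespaceId, String serviceName, String address) {
        List<String> instances = serviceMap
                .computeIfAbsent(namespaceId, ns -> new ConcurrentHashMap<>())
                .computeIfAbsent(serviceName, name -> new ArrayList<>());
        // Lock per service so registrations of different services do not block each other.
        synchronized (instances) {
            if (!instances.contains(address)) {
                instances.add(address);
            }
        }
    }

    /** Returns a snapshot copy so callers never see a list that is being modified. */
    List<String> instancesOf(String namespaceId, String serviceName) {
        Map<String, List<String>> services = serviceMap.get(namespaceId);
        List<String> instances = services == null ? null : services.get(serviceName);
        if (instances == null) {
            return new ArrayList<>();
        }
        synchronized (instances) {
            return new ArrayList<>(instances);
        }
    }

    public static void main(String[] args) {
        ServiceRegistrySketch registry = new ServiceRegistrySketch();
        registry.registerInstance("public", "demo-service", "10.0.0.1:8080");
        registry.registerInstance("public", "demo-service", "10.0.0.2:8080");
        registry.registerInstance("public", "demo-service", "10.0.0.1:8080"); // duplicate, ignored
        System.out.println(registry.instancesOf("public", "demo-service"));
    }
}
```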
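The call consistencyService.put(key, instances) refreshes all instances of the service. From the declaration of consistencyService we can tell that it invokes the put method of DelegateConsistencyServiceImpl. This is where the choice between AP and CP mode is made, through mapConsistencyService: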
- @Override
- public void put(String key, Record value) throws NacosException {
- mapConsistencyService(key).put(key, value);
- }
- // Choose AP or CP mode: AP mode uses the Distro protocol, CP mode uses the Raft protocol.
- private ConsistencyService mapConsistencyService(String key) {
- return KeyBuilder.matchEphemeralKey(key) ? ephemeralConsistencyService : persistentConsistencyService;
- }
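The delegation idea is small enough to model directly: the key carries an ephemeral marker, and the delegate routes on it. In the sketch below the prefix strings, the `##` separator, and the `MapStore` type are illustrative assumptions and not the actual KeyBuilder constants.

```java
import java.util.HashMap;
import java.util.Map;

class ConsistencyDelegateSketch {

    // Illustrative key layout: prefix + optional ephemeral marker + namespace + "##" + service.
    static final String INSTANCE_LIST_PREFIX = "iplist.";
    static final String EPHEMERAL_MARKER = "ephemeral.";

    /** Hypothetical in-memory stand-in for a consistency service. */
    static final class MapStore {
        final Map<String, String> data = new HashMap<>();

        void put(String key, String value) {
            data.put(key, value);
        }
    }

    final MapStore ephemeralStore = new MapStore();  // would be Distro-backed (AP)
    final MapStore persistentStore = new MapStore(); // would be Raft-backed (CP)

    static String buildKey(String namespaceId, String serviceName, boolean ephemeral) {
        return INSTANCE_LIST_PREFIX + (ephemeral ? EPHEMERAL_MARKER : "")
                + namespaceId + "##" + serviceName;
    }

    static boolean isEphemeralKey(String key) {
        return key.startsWith(INSTANCE_LIST_PREFIX + EPHEMERAL_MARKER);
    }

    /** Routes the write on the key alone, as mapConsistencyService does. */
    void put(String key, String value) {
        (isEphemeralKey(key) ? ephemeralStore : persistentStore).put(key, value);
    }

    public static void main(String[] args) {
        ConsistencyDelegateSketch delegate = new ConsistencyDelegateSketch();
        delegate.put(buildKey("public", "cache-service", true), "10.0.0.1:8080");
        delegate.put(buildKey("public", "db-service", false), "10.0.0.9:3306");
        System.out.println("AP store: " + delegate.ephemeralStore.data.keySet());
        System.out.println("CP store: " + delegate.persistentStore.data.keySet());
    }
}
```

Encoding the mode in the key means every later read or write can be routed without looking up the instance again.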
AP Mode
Nacos uses AP mode by default, implemented with the Distro protocol. The interface involved is EphemeralConsistencyService, and instance data is stored mainly through its put method:
- @Override
- public void put(String key, Record value) throws NacosException {
- // Store the data
- onPut(key, value);
- // Notify the other server nodes
- distroProtocol.sync(new DistroKey(key, KeyBuilder.INSTANCE_LIST_KEY_PREFIX), DataOperation.CHANGE,
- globalConfig.getTaskDispatchPeriod() / 2);
- }
put calls onPut to save the data and queue a notification:
- public void onPut(String key, Record value) {
- if (KeyBuilder.matchEphemeralInstanceListKey(key)) {
- Datum<Instances> datum = new Datum<>();
- datum.value = (Instances) value;
- datum.key = key;
- datum.timestamp.incrementAndGet();
- // Store the data
- dataStore.put(key, datum);
- }
- if (!listeners.containsKey(key)) {
- return;
- }
- notifier.addTask(key, DataOperation.CHANGE);
- }
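notifier.addTask mainly calls tasks.offer(Pair.with(datumKey, action)) to put the change into the blocking queue tasks. Notifier#run then processes the queue asynchronously, which keeps the registration path efficient: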
- public class Notifier implements Runnable {
- @Override
- public void run() {
- Loggers.DISTRO.info("distro notifier started");
- for (; ; ) {
- try {
- Pair<String, DataOperation> pair = tasks.take();
- handle(pair);
- } catch (Throwable e) {
- Loggers.DISTRO.error("[NACOS-DISTRO] Error while handling notifying task", e);
- }
- }
- }
- private void handle(Pair<String, DataOperation> pair) {
- // part of the code omitted
- for (RecordListener listener : listeners.get(datumKey)) {
- count++;
- try {
- if (action == DataOperation.CHANGE) {
- listener.onChange(datumKey, dataStore.get(datumKey).value);
- continue;
- }
- if (action == DataOperation.DELETE) {
- listener.onDelete(datumKey);
- continue;
- }
- } catch (Throwable e) {
- Loggers.DISTRO.error("[NACOS-DISTRO] error while notifying listener of key: {}", datumKey, e);
- }
- }
- }
- }
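The producer and consumer split used by the Notifier can be sketched with a plain BlockingQueue. This is a reduced model under assumptions: tasks are bare keys rather than key and action pairs, `Listener` and `demo` are hypothetical, and unlike the original loop this one exits when its thread is interrupted so the demo can end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class NotifierSketch implements Runnable {

    /** Hypothetical listener type; in Nacos the listener is the Service object. */
    interface Listener {
        void onChange(String key);
    }

    private final BlockingQueue<String> tasks = new ArrayBlockingQueue<>(1024);
    private final List<Listener> listeners = new CopyOnWriteArrayList<>();

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    /** Producer side: returns immediately, false if the queue is full. */
    boolean addTask(String key) {
        return tasks.offer(key);
    }

    /** Consumer side: blocks on take() and dispatches each key to every listener. */
    @Override
    public void run() {
        try {
            while (true) {
                String key = tasks.take();
                for (Listener listener : listeners) {
                    try {
                        listener.onChange(key);
                    } catch (RuntimeException e) {
                        System.err.println("listener failed for key " + key + ": " + e);
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop the loop on interrupt
        }
    }

    /** Starts a notifier, queues the keys, and returns what the listener saw, in order. */
    static List<String> demo(String... keys) {
        NotifierSketch notifier = new NotifierSketch();
        final List<String> seen = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(keys.length);
        notifier.addListener(key -> {
            seen.add(key);
            done.countDown();
        });
        Thread worker = new Thread(notifier, "notifier-sketch");
        worker.setDaemon(true);
        worker.start();
        for (String key : keys) {
            notifier.addTask(key);
        }
        try {
            done.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        worker.interrupt();
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        System.out.println(demo("service-a", "service-b"));
    }
}
```

Because the request thread only calls offer, a slow listener delays later notifications but not the registration call itself.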
For an event of type DataOperation.CHANGE, listener.onChange(datumKey, dataStore.get(datumKey).value) is called. The listener here is in fact our Service object.
- public void onChange(String key, Instances value) throws Exception {
- Loggers.SRV_LOG.info("[NACOS-RAFT] datum is changed, key: {}, value: {}", key, value);
- for (Instance instance : value.getInstanceList()) {
- if (instance == null) {
- // Reject this abnormal instance list:
- throw new RuntimeException("got null instance " + key);
- }
- if (instance.getWeight() > 10000.0D) {
- instance.setWeight(10000.0D);
- }
- if (instance.getWeight() < 0.01D && instance.getWeight() > 0.0D) {
- instance.setWeight(0.01D);
- }
- }
- updateIPs(value.getInstanceList(), KeyBuilder.matchEphemeralInstanceListKey(key));
- recalculateChecksum();
- }
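The weight handling in onChange amounts to a clamp with one exception: a weight of exactly 0 is left alone. A minimal sketch of that rule, with `clamp` and the constant names being hypothetical:

```java
class WeightClampSketch {

    static final double MAX_WEIGHT = 10000.0D;
    static final double MIN_POSITIVE_WEIGHT = 0.01D;

    /** Caps large weights and lifts tiny positive ones; 0 passes through unchanged. */
    static double clamp(double weight) {
        if (weight > MAX_WEIGHT) {
            return MAX_WEIGHT;
        }
        if (weight < MIN_POSITIVE_WEIGHT && weight > 0.0D) {
            return MIN_POSITIVE_WEIGHT;
        }
        return weight;
    }

    public static void main(String[] args) {
        System.out.println(clamp(20000.0D)); // above the cap
        System.out.println(clamp(0.001D));   // tiny but positive
        System.out.println(clamp(0.0D));     // zero is kept as-is
        System.out.println(clamp(1.0D));     // ordinary weight
    }
}
```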
updateIPs writes the instance data into the in-memory registry and notifies the subscribers of this service over UDP.
- public void updateIPs(Collection<Instance> instances, boolean ephemeral) {
- Map<String, List<Instance>> ipMap = new HashMap<>(clusterMap.size());
- for (String clusterName : clusterMap.keySet()) {
- ipMap.put(clusterName, new ArrayList<>());
- }
- for (Instance instance : instances) {
- try {
- if (instance == null) {
- Loggers.SRV_LOG.error("[NACOS-DOM] received malformed ip: null");
- continue;
- }
- if (StringUtils.isEmpty(instance.getClusterName())) {
- instance.setClusterName(UtilsAndCommons.DEFAULT_CLUSTER_NAME);
- }
- if (!clusterMap.containsKey(instance.getClusterName())) {
- Loggers.SRV_LOG
- .warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
- instance.getClusterName(), instance.toJson());
- Cluster cluster = new Cluster(instance.getClusterName(), this);
- cluster.init();
- getClusterMap().put(instance.getClusterName(), cluster);
- }
- List<Instance> clusterIPs = ipMap.get(instance.getClusterName());
- if (clusterIPs == null) {
- clusterIPs = new LinkedList<>();
- ipMap.put(instance.getClusterName(), clusterIPs);
- }
- clusterIPs.add(instance);
- } catch (Exception e) {
- Loggers.SRV_LOG.error("[NACOS-DOM] failed to process ip: " + instance, e);
- }
- }
- for (Map.Entry<String, List<Instance>> entry : ipMap.entrySet()) {
- //make every ip mine
- List<Instance> entryIPs = entry.getValue();
- // Update the service list
- clusterMap.get(entry.getKey()).updateIps(entryIPs, ephemeral);
- }
- setLastModifiedMillis(System.currentTimeMillis());
- // Push the change to the service's subscribers
- getPushService().serviceChanged(this);
- StringBuilder stringBuilder = new StringBuilder();
- for (Instance instance : allIPs()) {
- stringBuilder.append(instance.toIpAddr()).append("_").append(instance.isHealthy()).append(",");
- }
- Loggers.EVT_LOG.info("[IP-UPDATED] namespace: {}, service: {}, ips: {}", getNamespaceId(), getName(),
- stringBuilder.toString());
- }
CP Mode
CP mode in Nacos is implemented with the Raft protocol. The implementation class is PersistentConsistencyServiceDelegateImpl.
Let us first look at its put method:
- public void put(String key, Record value) throws NacosException {
- checkIsStopWork();
- try {
- raftCore.signalPublish(key, value);
- } catch (Exception e) {
- Loggers.RAFT.error("Raft put failed.", e);
- throw new NacosException(NacosException.SERVER_ERROR, "Raft put failed, key:" + key + ", value:" + value,
- e);
- }
- }
The call to raftCore.signalPublish(key, value) goes through the following main steps:
- Check whether the current node is the Leader; if it is not, forward the request to the Leader.
- On the Leader, first run onPublish(datum, peers.local()), which persists to file through raftStore.updateTerm(local.term.get()) and then updates memory asynchronously through NotifyCenter.publishEvent(ValueChangeEvent.builder().key(datum.key).action(DataOperation.CHANGE).build()).
- A CountDownLatch implements a majority mechanism: new CountDownLatch(peers.majorityCount()). Success is returned only once at least N/2 + 1 nodes, the Leader included, have succeeded.
- Call /raft/datum/commit on the other Nacos nodes to synchronize the instance data.
- public void signalPublish(String key, Record value) throws Exception {
- if (stopWork) {
- throw new IllegalStateException("old raft protocol already stop work");
- }
- if (!isLeader()) {
- ObjectNode params = JacksonUtils.createEmptyJsonNode();
- params.put("key", key);
- params.replace("value", JacksonUtils.transferToJsonNode(value));
- Map<String, String> parameters = new HashMap<>(1);
- parameters.put("key", key);
- final RaftPeer leader = getLeader();
- raftProxy.proxyPostLarge(leader.ip, API_PUB, params.toString(), parameters);
- return;
- }
- OPERATE_LOCK.lock();
- try {
- final long start = System.currentTimeMillis();
- final Datum datum = new Datum();
- datum.key = key;
- datum.value = value;
- if (getDatum(key) == null) {
- datum.timestamp.set(1L);
- } else {
- datum.timestamp.set(getDatum(key).timestamp.incrementAndGet());
- }
- ObjectNode json = JacksonUtils.createEmptyJsonNode();
- json.replace("datum", JacksonUtils.transferToJsonNode(datum));
- json.replace("source", JacksonUtils.transferToJsonNode(peers.local()));
- onPublish(datum, peers.local());
- final String content = json.toString();
- final CountDownLatch latch = new CountDownLatch(peers.majorityCount());
- for (final String server : peers.allServersIncludeMyself()) {
- if (isLeader(server)) {
- latch.countDown();
- continue;
- }
- final String url = buildUrl(server, API_ON_PUB);
- HttpClient.asyncHttpPostLarge(url, Arrays.asList("key", key), content, new Callback<String>() {
- @Override
- public void onReceive(RestResult<String> result) {
- if (!result.ok()) {
- Loggers.RAFT
- .warn("[RAFT] failed to publish data to peer, datumId={}, peer={}, http code={}",
- datum.key, server, result.getCode());
- return;
- }
- latch.countDown();
- }
- @Override
- public void onError(Throwable throwable) {
- Loggers.RAFT.error("[RAFT] failed to publish data to peer", throwable);
- }
- @Override
- public void onCancel() {
- }
- });
- }
- if (!latch.await(UtilsAndCommons.RAFT_PUBLISH_TIMEOUT, TimeUnit.MILLISECONDS)) {
- // only majority servers return success can we consider this update success
- Loggers.RAFT.error("data publish failed, caused failed to notify majority, key={}", key);
- throw new IllegalStateException("data publish failed, caused failed to notify majority, key=" + key);
- }
- long end = System.currentTimeMillis();
- Loggers.RAFT.info("signalPublish cost {} ms, key: {}", (end - start), key);
- } finally {
- OPERATE_LOCK.unlock();
- }
- }
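The majority wait in signalPublish can be isolated into a few lines. The sketch below assumes majorityCount is N/2 + 1 with integer division, replaces the HTTP calls with a boolean per follower, and uses hypothetical names (`publish`, `followerAcks`); it illustrates the latch technique and is not the Nacos implementation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class MajorityAckSketch {

    /** Smallest number of nodes that forms a majority of the cluster. */
    static int majorityCount(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    /**
     * followerAcks[i] says whether follower i acknowledges the write.
     * The leader counts as one vote. Returns true if a majority is reached in time.
     */
    static boolean publish(boolean[] followerAcks, long timeoutMillis) {
        int clusterSize = followerAcks.length + 1;
        final CountDownLatch latch = new CountDownLatch(majorityCount(clusterSize));
        latch.countDown(); // the leader's own vote
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, followerAcks.length));
        try {
            for (final boolean ack : followerAcks) {
                // Stand-in for the asynchronous HTTP call to a follower.
                pool.execute(() -> {
                    if (ack) {
                        latch.countDown();
                    }
                });
            }
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(majorityCount(3));                          // 2
        System.out.println(publish(new boolean[] {true, false}, 500));  // true
        System.out.println(publish(new boolean[] {false, false}, 200)); // false
    }
}
```

Sizing the latch to the majority rather than to the full cluster is what lets the write succeed while a minority of nodes is slow or down.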
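Determining AP Mode vs. CP Mode
If a client registers with Nacos using ephemeral=true, the Nacos cluster treats that instance in AP fashion, using Distro; if it registers with ephemeral=false, the cluster treats it in CP fashion, using Raft. AP and CP therefore coexist in the same cluster, chosen by the attribute sent at registration; only the behavior toward each client instance differs.
Debugging the Nacos Source
Nacos Startup File
First we need to find the Nacos startup class, and for that we first need the jar that is launched.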

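Then we extract target/nacos-server.jar.
Extraction commands:
- # Extract the jar (a jar is a zip archive, so use the JDK's jar tool rather than tar -z)
- jar -xvf nacos-server.jar
- # Show the contents of MANIFEST.MF
- cat META-INF/MANIFEST.MF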
- Manifest-Version: 1.0
- Implementation-Title: nacos-console 1.4.2
- Implementation-Version: 1.4.2
- Archiver-Version: Plexus Archiver
- Built-By: xiweng.yy
- Spring-Boot-Layers-Index: BOOT-INF/layers.idx
- Specification-Vendor: Alibaba Group
- Specification-Title: nacos-console 1.4.2
- Implementation-Vendor-Id: com.alibaba.nacos
- Spring-Boot-Version: 2.5.0-RC1
- Implementation-Vendor: Alibaba Group
- Main-Class: org.springframework.boot.loader.PropertiesLauncher
- Spring-Boot-Classpath-Index: BOOT-INF/classpath.idx
- Start-Class: com.alibaba.nacos.Nacos
- Spring-Boot-Classes: BOOT-INF/classes/
- Spring-Boot-Lib: BOOT-INF/lib/
- Created-By: Apache Maven 3.6.3
- Build-Jdk: 1.8.0_231
- Specification-Version: 1.4.2
Among the entries in MANIFEST.MF we find Start-Class, which names the startup class of the Spring Boot project: com.alibaba.nacos.Nacos
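The same lookup can be done programmatically with java.util.jar.Manifest from the JDK. The sketch below parses manifest text from a string so that it runs without the jar at hand; `startClass` is a hypothetical helper, and the sample text is a shortened copy of the manifest shown above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

class StartClassSketch {

    /** Returns the Start-Class main attribute, or null if the manifest has none. */
    static String startClass(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes().getValue("Start-Class");
        } catch (IOException e) {
            throw new IllegalArgumentException("not a valid manifest", e);
        }
    }

    public static void main(String[] args) {
        // Manifest text must end with a newline to be parsed correctly.
        String text = "Manifest-Version: 1.0\n"
                + "Main-Class: org.springframework.boot.loader.PropertiesLauncher\n"
                + "Start-Class: com.alibaba.nacos.Nacos\n";
        System.out.println(startClass(text));
    }
}
```

For a jar on disk, `new java.util.jar.JarFile(path).getManifest()` yields the same Manifest object without extracting the archive.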
Debugging Nacos
With the startup class com.alibaba.nacos.Nacos identified, we can launch it from IDEA and debug from there.
This article is reposted from the WeChat official account 「運維開發故事」. To repost it, please contact that account.