Problem with dhcpd and failover (how to deal with "uid lease is duplicate" messages)

tutitu · Post by **tutitu** » 14.08.2008 16:36

Please help: maybe someone has run into this, or has some suggestions.

Two servers: system compact 3.0.4, dhcp-server 3.0.5.
Server configs:

Code:

#dhcp1
#Primary

ddns-update-style none;
one-lease-per-client true;
default-lease-time 300;
max-lease-time 300;
authoritative;

failover peer "dhcp" {
  primary;
  address 10.0.2.3;
  port 847;
  peer address 10.0.2.2;
  peer port 647;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 600;
  split 128;
  load balance max seconds 3;
}
include "/etc/dhcp/master1.conf";

#dhcp2
#Secondary

ddns-update-style none;
one-lease-per-client true;
default-lease-time 300;
max-lease-time 300;
authoritative;

failover peer "dhcp" {
  secondary;
  address 10.0.2.2;
  port 647;
  peer address 10.0.2.3;
  peer port 847;
  max-response-delay 60;
  max-unacked-updates 10;
  load balance max seconds 3;
}
include "/etc/dhcp/master1.conf";

#Master1.conf
subnet 10.0.2.0 netmask 255.255.252.0 {
        pool {
                failover peer "dhcp";
                range 10.0.2.4 10.0.5.254;
                deny dynamic bootp clients;
        }
        option routers 10.0.2.1;
        option domain-name "name.com";
        option domain-name-servers 10.0.0.9;
        option subnet-mask 255.255.252.0;
}

The dhcpd & failover configuration works fine: from the client's side, IP addresses are handed out correctly in the various failover "modes".
After the servers start, dhcpd.leases shows both servers in the normal state, and connecting clients get an IP. After some time, if you do a release/renew on the client, the client renews (it gets the same IP from the first server, along with the lease time), but the log of the other server shows a message for this client's MAC about a duplicate on another IP:

Code:

  Aug 14 20:31:07 xxxx dhcpd: uid lease 10.0.2.249 for client xx:xx:xx:xx:xx:xx is duplicate on 10.0.2.0/22

Thanks in advance for any help.

BAF · Post by **BAF** » 29.01.2014 07:41

tutitu wrote: ↑
14.08.2008 16:36
Please help: maybe someone has run into this, or has some suggestions.
Code:
  Aug 14 20:31:07 xxxx dhcpd: uid lease 10.0.2.249 for client xx:xx:xx:xx:xx:xx is duplicate on 10.0.2.0/22
Thanks in advance for any help.

I hope the author is still alive.

Do both servers move to the normal state after startup? Please show the startup log.

If you solved the problem, please respond; I have a different problem with this setup. Namely, when either server fails, the addresses reserved on the dead server are not allocated to new clients. Here is the relevant passage from the man page:

It is possible during a prolonged failure to tell the remaining server that the other server is down, in which case the remaining server will (over time) reclaim all the addresses the other server had available for allocation, and begin to reuse them. This is called putting the server into the PARTNER-DOWN state.

That is, after some time the surviving server should start handing them out. I waited 12 hours, but the working server never started handing out the second half of the pool.
If anyone has set up such a scheme, please respond.

drBatty · Post by **drBatty** » 29.01.2014 08:29

BAF wrote: ↑
29.01.2014 07:41
That is, after some time the surviving server should start handing them out. I waited 12 hours, but the working server never started handing out the second half of the pool.

There is no need to wait. The aggrieved clients will send broadcast requests, and the remaining server will hand out addresses. That is, the initiative has to come from the clients.

BAF · Post by **BAF** » 29.01.2014 10:03

That is exactly the point: it does not hand them out.

Code:

Jan 29 12:01:00 test dhcpd: DHCPACK on 10.43.28.250 to 00:51:38:93:bf:33 (DHCP-dropper) via 10.43.28.1
Jan 29 12:01:00 test dhcpd: DHCPDISCOVER from 00:5b:bd:97:7d:1d via 10.43.28.1: peer holds all free leases
Jan 29 12:01:03 test dhcpd: DHCPDISCOVER from 00:5b:bd:97:7d:1d via 10.43.28.1: peer holds all free leases
Jan 29 12:01:06 test dhcpd: DHCPDISCOVER from 00:5b:bd:97:7d:1d via 10.43.28.1: peer holds all free leases
Jan 29 12:01:09 test dhcpd: DHCPDISCOVER from 00:5b:bd:97:7d:1d via 10.43.28.1: peer holds all free leases
Jan 29 12:01:12 test dhcpd: DHCPDISCOVER from 00:5b:bd:97:7d:1d via 10.43.28.1: peer holds all free leases

It considers the other half of the pool to belong to the other server, and that's that.
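For anyone seeing the same symptom, one quick way to find out which clients are being turned away is to filter the dhcpd log for the "peer holds all free leases" message shown above. This is a minimal sketch that assumes the log lines have the same format as in the excerpt; the function name `starved_clients` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read dhcpd log lines on stdin and print the unique
# MAC addresses whose DHCPDISCOVER was refused with
# "peer holds all free leases".
starved_clients() {
    grep 'peer holds all free leases' |
        sed -n 's/.*DHCPDISCOVER from \([0-9a-f:]*\) .*/\1/p' |
        sort -u
}

# Usage (the log path is an assumption; it depends on the syslog setup):
#   starved_clients < /var/log/messages
```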

Post by **Bizdelnick** » 29.01.2014 10:12

BAF wrote: ↑
29.01.2014 07:41
It is possible during a prolonged failure to tell the remaining server that the other server is down, in which case the remaining server will (over time) reclaim all the addresses the other server had available for allocation, and begin to reuse them. This is called putting the server into the PARTNER-DOWN state.

So, as I understand it, you have to tell it this explicitly rather than wait for it to figure it out on its own.

BAF · Post by **BAF** » 29.01.2014 10:40

Bizdelnick wrote: ↑
29.01.2014 10:12
BAF wrote: ↑
29.01.2014 07:41
It is possible during a prolonged failure to tell the remaining server that the other server is down, in which case the remaining server will (over time) reclaim all the addresses the other server had available for allocation, and begin to reuse them. This is called putting the server into the PARTNER-DOWN state.

So, as I understand it, you have to tell it this explicitly rather than wait for it to figure it out on its own.

My understanding was that it should move to the PARTNER-DOWN state over time, or that this can be done manually via OMAPI. But all of this is meant to be a fault-tolerant configuration, and if my partner server fails while I'm in the Maldives, am I supposed to log in over ssh and switch it into this state by hand? That's absurd; it shouldn't work like that. But perhaps that is how it is, which is very sad.

Help me figure this out. Here is what I dug up.

Quoting a fragment:

9.9.1. Upon entry to COMMUNICATIONS-INTERRUPTED state

When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been
configured to support an automatic transition out of COMMUNICATIONS-
INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period"
has been configured, see section 10), then a timer MUST be started
for the length of the configured safe period.

As I understand it (my English is bad), this says there is a certain "safe period" that has to be configured, and once it expires the server moves from the COMMUNICATIONS-INTERRUPTED state to PARTNER-DOWN. Did I understand that correctly?

If so, here is section 10:

Code:

10.  Safe Period

   Due to the restrictions imposed on each server while in
   COMMUNICATIONS-INTERRUPTED state, long-term operation in this state
   is not feasible for either server.  One reason that these states
   exist at all, is to allow the servers to easily survive transient
   network communications failures of a few minutes to a few days
   (although the actual time periods will depend a great deal on the
   DHCP activity of the network in terms of arrival and departure of
   DHCP clients on the network).

   Eventually, when the servers are unable to communicate, they will
   have to move into a state where they no longer can re-integrate
   without some possibility of a duplicate IP address allocation.  There
   are two ways that they can move into this state (known as PARTNER-
   DOWN).

   They can either be informed by external command that, indeed, the
   partner server is down.  In this case, there is no difficulty in
   moving into the PARTNER-DOWN state since it is an accurate reflection of
   reality and the protocol has been designed to operate correctly (even
   during reintegration) as long as, when in PARTNER-DOWN state the
   partner is, indeed, down.

   The more difficult scenario is when the servers are running
   unattended for extended periods, and in this case an option is provided
   to configure something called a "safe-period" into each server.  This
   OPTIONAL safe-period is the period after which either the primary or
   secondary server will automatically transition to PARTNER-DOWN from
   COMMUNICATIONS-INTERRUPTED state.  If this transition is completed
   and the partner is not down, then the possibility of duplicate IP
   address allocations will exist.

   The goal of the "safe-period" is to allow network operations staff
   some time to react to a server moving into COMMUNICATIONS-INTERRUPTED
   state.  During the safe-period the only requirement is that the
   network operations staff determine if both servers are still running --
   and if they are, to either fix the network communications failure
   between them, or to take one of the servers down before the
   expiration of the safe-period.

   The length of the safe-period is installation dependent, and depends
   in large part on the number of unallocated IP addresses within the
   subnet address pool and the expected frequency of arrival of
   previously unknown DHCP clients requiring IP addresses.  Many
   environments should be able to support safe-periods of several days.

   During this safe period, either server will allow renewals from any
   existing client.  The only limitation concerns the need for IP
   addresses for the DHCP server to hand out to new DHCP clients and the
   need to re-allocate IP addresses to different DHCP clients.

   The number of "extra" IP addresses required is equal to the expected
   total number of new DHCP clients encountered during the safe period.
   This is dependent only on the arrival rate of new DHCP clients, not
   the total number of outstanding leases on IP addresses.

   In the unlikely event that a relatively short safe period of an hour
   is all that can be used (given a dearth of IP addresses or a very
   high arrival rate of new DHCP clients), even that can provide
   substantial benefits in allowing the DHCP subsystem to ride through
   minor problems that could occur and be fixed within that hour.  In
   these cases, no possibility of duplicate IP address allocation
   exists, and re-integration after the failure is solved will be
   automatic and require no operator intervention.

I still don't understand how to configure this period. Comrades, please help me make sense of the translation.

I forgot to mention that in the COMMUNICATIONS-INTERRUPTED state the server assumes its partner is still running and merely unreachable, so it does not allocate the addresses that were marked as backup for the partner. That is exactly my case. What remains is to work out how to configure it so that it moves to the PARTNER-DOWN state and serves clients from the entire free pool. But how?

Post by **Bizdelnick** » 29.01.2014 11:04

BAF wrote: ↑
29.01.2014 10:40
if my partner server fails while I'm in the Maldives, am I supposed to log in over ssh and switch it into this state by hand? That's absurd; it shouldn't work like that.

BAF
Please show your config.

BAF · Post by **BAF** » 29.01.2014 12:03

Primary

Code:

ddns-update-style none;
ddns-updates off;

#default-lease-time 14400;
#max-lease-time 86400;

default-lease-time 300;
max-lease-time 600;

ping-check true;
ping-timeout 1;

# If this DHCP server is the official DHCP server for the local
# network, the authoritative directive should be uncommented.
authoritative;

# Use this to send dhcp log messages to a different log file (you also
# have to hack syslog.conf to complete the redirection).
log-facility local7;

omapi-port 7911;

failover peer "test" {
  primary;
  address 10.43.29.5;
  port 519;
  peer address 10.43.29.6;
  peer port 520;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;
  split 128;
  load balance max seconds 5;
}

include "/etc/dhcp/local_network.conf";
include "/etc/dhcp/test_network.conf";

Code:

more /etc/dhcp/test_network.conf

shared-network 28{
        subnet 10.43.28.0 netmask 255.255.255.0 {
                pool {
                        failover peer "test";
                        deny dynamic bootp clients;
                        range 10.43.28.200 10.43.28.250;
                        }
                option subnet-mask 255.255.255.0;
                option ntp-servers 10.43.28.1;
                next-server 10.43.28.1;
                option routers 10.43.28.1;
                }
}

Secondary

Code:

ddns-update-style none;
ddns-updates off;


#default-lease-time 14400;
#max-lease-time 86400;


default-lease-time 300;
max-lease-time 600;

ping-check true;
ping-timeout 1;


# If this DHCP server is the official DHCP server for the local
# network, the authoritative directive should be uncommented.
authoritative;

# Use this to send dhcp log messages to a different log file (you also
# have to hack syslog.conf to complete the redirection).
log-facility local7;

failover peer "test" {
  secondary;
  address 10.43.29.6;
  port 520;
  peer address 10.43.29.5;
  peer port 519;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;
#  split 128;
  load balance max seconds 3;
}

include "/etc/dhcp/local_network.conf";
include "/etc/dhcp/test_network.conf";

Code:

more /etc/dhcp/test_network.conf
shared-network 28 {
        default-lease-time 14400;
        subnet 10.43.28.0 netmask 255.255.255.0 {
                pool {
                        failover peer "test";
                        deny dynamic bootp clients;
                        range 10.43.28.200 10.43.28.250;
                        }
                option subnet-mask 255.255.255.0;
                option ntp-servers 10.43.28.1;
                next-server 10.43.28.1;
                option routers 10.43.28.1;
                }
}

Post by **Bizdelnick** » 29.01.2014 12:21

https://kb.isc.org/article/AA-00502/31/A-Ba...P-Failover.html
item 7

BAF · Post by **BAF** » 29.01.2014 12:51

Bizdelnick wrote: ↑
29.01.2014 12:21
https://kb.isc.org/article/AA-00502/31/A-Ba...P-Failover.html
item 7

7) Configure OMAPI and define a secret key.

# insert this (with your own key text substituted) into dhcpd.conf on primary and secondary..

omapi-port 7911;
omapi-key omapi_key;

key omapi_key {
algorithm hmac-md5;
secret Ofakekeyfakekeyfakekey==;
}

I did this, with no result.

OMAPI is needed for controlling the daemon itself, for example to put it into shutdown manually or to query DHCP for information. Everything should work without OMAPI, since the daemons exchange packets over the ports and addresses specified in the failover section.

Or am I wrong?

Post by **Bizdelnick** » 29.01.2014 13:23

BAF wrote: ↑
29.01.2014 12:51
I did this, with no result.

Did you generate the key correctly?

BAF · Post by **BAF** » 29.01.2014 13:24

Bizdelnick wrote: ↑
29.01.2014 13:23
BAF wrote: ↑
29.01.2014 12:51
I did this, with no result.

Did you generate the key correctly?

I copied it from the article.

Post by **Bizdelnick** » 29.01.2014 13:27

It has to be generated with dnssec-keygen.

BAF · Post by **BAF** » 29.01.2014 13:34

Bizdelnick wrote: ↑
29.01.2014 13:27
It has to be generated with dnssec-keygen.

Can you explain why this is needed? Did I miss something in the documentation? It moves between the various states without trouble, except for DOWN and SHUTDOWN, and the manual route works:

Code:

#!/bin/sh

#  uses omshell to connect to a dhcp server on the
#  local machine, create a control object, set the
#  state of the control object, and update the
#  running server to cause that server to shut down
#  gracefully.
#
#  per dhcpd man page, server shutdown can take
#  several seconds as the server waits for close
#  on all OMAPI connections.  Watching log files
#  for shutdown messages is recommended.

omshell << END_OF_INPUT > /dev/null 2> /dev/null
server localhost
port 7911
key omapi_key Ofakekeyfakekeyfakekey==
connect
new control
open
set state=2
update
END_OF_INPUT

 echo "done sending shutdown instruction to dhcp server.."

What is the point of generating another key, and why use this at all? Why do you think it won't work without it?

Post by **Bizdelnick** » 29.01.2014 13:57

I don't think that; it's just that an article in the official knowledge base is unlikely to contain nonsense. Also note the item about mandatory time synchronization. And check the logs, of course, if you haven't done so yet.

BAF · Post by **BAF** » 29.01.2014 14:09

As I understand it, OMAPI is described there for a reason: so that changes can be made manually. But I, like everyone else in my position, need this to happen automatically; that is the whole reason I'm setting this up. So that I can sleep peacefully at night.
Time is synchronized on the servers. The logs look correct:

Code:

Jan 29 14:44:13 test dhcpd: failover peer test: I move from normal to startup
Jan 29 14:44:13 test dhcpd: failover peer test: peer moves from normal to communications-interrupted
Jan 29 14:44:13 test dhcpd: failover peer test: I move from startup to normal
Jan 29 14:44:13 test dhcpd: balancing pool 702ae430 28  total 51  free 23  backup 23  lts 0  max-own (+/-)5
Jan 29 14:44:13 test dhcpd: balanced pool 702ae430 28  total 51  free 23  backup 23  lts 0  max-misbal 7
Jan 29 14:44:13 test dhcpd: failover peer test: peer moves from communications-interrupted to normal

When one server goes down, the second logs:

Code:

Jan 29 14:46:28 dhcp2 dhcpd: peer test: disconnected
Jan 29 14:46:28 dhcp2 dhcpd: failover peer test: I move from normal to communications-interrupted

And that's it: it keeps running in this mode and never moves to partner-down. On Cisco, for example, I found how to change the safe-period:

Code:

dhcp set failover-safe-period=24h

and if it is not set, the default is 24h.
The question is what the default is for ISC DHCP on Linux, and how to change it.

BAF · Post by **BAF** » 30.01.2014 21:09

Phew!!! Hooray!!! I dug up the solution to my problem.
Option 1: manual, as recommended above

Code:

omshell << EOF
connect
new failover-state
set name = "test"
open
set local-state = 3
update
EOF

Here "test" is the name of your failover peer from the global section, and 3 is the state to move to (per the list below, 4 is partner down). Below is the list of all states for ISC DHCP version 4:

Code:

1 - startup
2 - normal
3 - communications interrupted
4 - partner down
5 - potential conflict
6 - recover
7 - paused
8 - shutdown
9 - recover done
10 - resolution interrupted
11 - conflict done
254 - recover wait
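Going by the list above, the value for partner down is 4, so the manual switch from option 1 could be wrapped in a small script. This is only a sketch: the function name `partner_down_commands` is invented here, and it assumes omshell can reach the local dhcpd on its default OMAPI port without a key, as in the setup described in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the omshell commands that set the local
# failover state of the named peer to 4 (partner down in the list above).
partner_down_commands() {
    peer_name="$1"
    cat <<EOF
connect
new failover-state
set name = "$peer_name"
open
set local-state = 4
update
EOF
}

# Usage, only when the partner really is down:
#   partner_down_commands test | omshell
```

Printing the commands instead of calling omshell directly makes it possible to review them before piping them in.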

Option 2: automatic, and better suited to the Maldives :)). Just add this to the failover section of the config:

Code:

auto-partner-down sec;

auto-partner-down sec; This statement specifies a sec seconds time delay upon entering the communications-interrupted state when the server is unable to communicate with its failover peer. The default is 0. This option should be used with care.

So it is disabled by default. What fiends. They could at least have mentioned this in some man page. I have already tested it, and it works.

Code:

Jan 30 22:45:17 test dhcpd: failover peer test: I move from communications-interrupted to startup
Jan 30 22:45:33 test dhcpd: failover peer test: I move from startup to communications-interrupted
Jan 30 22:46:33 test dhcpd: failover peer test: I move from communications-interrupted to partner-down
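To confirm after a failover test that the surviving server really made the automatic transition, the log can be checked for the message shown above. This is a minimal sketch that assumes the same log format; `reached_partner_down` is an illustrative name, not part of ISC DHCP.

```shell
#!/bin/sh
# Hypothetical helper: read dhcpd log lines on stdin and succeed (exit 0)
# if the given failover peer logged a move into partner-down.
reached_partner_down() {
    peer_name="$1"
    grep -q "failover peer $peer_name: I move from communications-interrupted to partner-down"
}

# Usage (the log path is an assumption; it depends on the syslog setup):
#   reached_partner_down test < /var/log/messages && echo "partner-down reached"
```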

Phew, I struggled with this for a long time. In any case, thanks for the support. Problem solved.

P.S. I was right: there is no need to generate keys, specifying the port is enough. Also, the example script on the official site does not work; apparently it is for version 3, while 4.2 is now the current stable version in the repos.

unixforum.org
