Class Recording Video on Lean Six Sigma - Basic Statistics

Lean Six Sigma - Basic Statistics Session

Why do we need data?

  • To analyse performance
  • To understand the system
  • To review the system
  • To take the correct decision
  • To conclude results
  • For Comparison
  • Identify deviations
  • Future Planning
  • For prediction or anticipation
  • Find out discrepancies
  • Comparison with a standard
  • Determine the cause of the problem
  • Review of plans and Goals
  • Eliminate repeatability
  • Visualize relationship

 

 

 

 

What do we need data for?

  • Analyze the current situation
    • To Analyze trend
  • Prepare for corrective action
  • Predict

 

Central Tendency

Data tends to be close to its centre.

Example: mileage of a car, CT = 15 km/litre

 

 

 

Mean

The mean is the arithmetic average. It is a good measure of central tendency when there is not too much variation in the data.

Disadvantage: it is impacted by extremely high or extremely low values.

 

Example: 10 families with 1000 pounds each. Total = 10 × 1000 = 10000 pounds.

Avg = sum of all observations / number of observations

                = (1000 + 1000 + 1000 + 1000 + 1000 + 1000 + 1000 + 1000 + 1000 + 1000) / 10

                Average or Mean = 1000 pounds

Add one more resident: LNM = 46.5 Bn pounds

Avg = (1000 + 1000 + 1000 + 1000 + 1000 + 1000 + 1000 + 1000 + 1000 + 1000 + 46.5 Bn) / 11

                ≈ 4.23 Bn pounds

Going by the mean alone, this locality appears to have billionaires only.
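The effect of one extreme value on the mean can be checked with a short Python sketch. It uses only the figures from the example above; the variable names are illustrative.

```python
from statistics import mean

# Ten families with 1000 pounds each (figures from the example above).
families = [1000] * 10
mean_without_outlier = mean(families)

# Add one resident with 46.5 billion pounds.
with_outlier = families + [46_500_000_000]
mean_with_outlier = mean(with_outlier)

print(mean_without_outlier)                 # 1000
print(round(mean_with_outlier / 1e9, 2))    # about 4.23 (billion pounds)
```

A single extreme observation moves the mean from 1000 pounds to several billion pounds, which is why the mean is a poor summary for data with high variation.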

 

150 – 194, 28000

 

 

 

 

Median

  • Prefer the median when your data has high variation
  • Data is arranged in ascending order, and the value at the 50% (middle) position becomes your median.
  • Advantage: it is not impacted by extremely high or low data points
  • Disadvantage: because it is a positional value, it can’t be used for further mathematical calculations

Avg = 1L, so Total = 1L × 11 = 11L

Median = 1L means that 50% of the time the value is less than or equal to 1L.
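As a rough illustration of why the median resists extreme values, the sketch below reuses the ten-families-plus-one-billionaire figures from the Mean section. Reusing that data set here is an assumption made for illustration.

```python
from statistics import mean, median

# Ten families with 1000 pounds each, plus one resident with 46.5 billion.
incomes = [1000] * 10 + [46_500_000_000]

# Median by hand: sort ascending and take the middle position.
ordered = sorted(incomes)
middle_value = ordered[len(ordered) // 2]   # 11 values, so index 5

print(middle_value)       # 1000
print(median(incomes))    # 1000, same result from the standard library
print(mean(incomes))      # roughly 4.23 billion
```

The median stays at 1000 pounds while the mean jumps to billions, so the median describes the typical family far better in this case.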

 

 

Recruiters – 35K * 20 = 7 L

Median = 27K –

 

 

 

Mode

Should be used only when the data has a limited number of possible values.

It is based on frequency of occurrence.

Batsman =

0 1 2 3 4 5 6

 

1    2    3    4     5    6     7
12   14   19   100   2    100   1

 

The mode is the value that occurs the most.

Mode = 0

Mode example: dice

Face        1    2    3    4    5    6
Frequency   12   14   19   85   2    42

Mode = 4

Face        1    2    3    4    5    6
Frequency   12   14   92   92   2    25

Mode = 3, 4 (referred to as bi-modal)

Face        1    2    3    4    5    6
Frequency   12   14   25   25   25   2

Mode = 3, 4, 5 (tri-modal)
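The three dice examples above can be reproduced with a small Python sketch. It assumes each pair of rows is a face/frequency table; the helper name `modes_from_counts` is hypothetical.

```python
from statistics import multimode

def modes_from_counts(counts):
    """Expand a {value: frequency} table into raw data and return every mode."""
    data = []
    for value, frequency in counts.items():
        data.extend([value] * frequency)
    return multimode(data)

# Frequencies taken from the three dice examples above.
unimodal = {1: 12, 2: 14, 3: 19, 4: 85, 5: 2, 6: 42}
bimodal = {1: 12, 2: 14, 3: 92, 4: 92, 5: 2, 6: 25}
trimodal = {1: 12, 2: 14, 3: 25, 4: 25, 5: 25, 6: 2}

print(modes_from_counts(unimodal))   # [4]
print(modes_from_counts(bimodal))    # [3, 4]
print(modes_from_counts(trimodal))   # [3, 4, 5]
```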

 

 

Variation

 

Range = Max - Min

 

Each quartile interval always holds 25% of the data:

Min to Q1 = 25%

Q1 to Median (Q2) = 25%

Q2 to Q3 = 25%

Q3 to Max = 25%

IQR (Inter Quartile Range) = Q3 - Q1

SF (Stability Factor) = Q1/Q3; best case = 1, worst case = as far away from 1 as possible
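The positional measures can be sketched in Python as follows. The data set is hypothetical, and the quartile values depend on the convention used: this sketch relies on the default "exclusive" method of `statistics.quantiles`, which may differ slightly from what a given statistics package reports.

```python
from statistics import quantiles

# Hypothetical data set, used only for illustration.
data = sorted([1, 2, 3, 4, 5, 6, 7])

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)

data_range = max(data) - min(data)   # Range = Max - Min
iqr = q3 - q1                        # IQR = Q3 - Q1
stability_factor = q1 / q3           # SF = Q1/Q3, closer to 1 is better

print(q1, q2, q3)                    # 2.0 4.0 6.0
print(data_range, iqr)               # 6 4.0
print(round(stability_factor, 2))    # 0.33
```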

 

 

 

 

Data: 1, 2, 3, 4, 5

Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3

Distance of data from the centre:

(1-3) + (2-3) + (3-3) + (4-3) + (5-3) = (-2) + (-1) + (0) + (+1) + (+2) = 0

 

 

Because the distances from the mean always sum to zero, each distance is squared:

(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2 = 4 + 1 + 0 + 1 + 4 = 10

1,2,3,4,5,3 -

Sum of Squares (SS) = sum of the squared distances of data from its centre

Variance = average of the squared distances of data from its centre

 = [(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2] / (n - 1)

 

 

1 2 3 4 5

Mean = 3

(1-3) + (2-3) + (3-3) + (4-3) + (5-3) = 0

(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2 = Sum of Squares

4 + 1 + 0 + 1 + 4 = 10 = sum of squared distances of data from its centre (mean)

[(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2] / (n - 1) = Variance = average of the squared distances of data from its centre (mean)

Variance = 10/(5 - 1) = 10/4 = 2.5
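The same calculation can be followed step by step in Python. This is a minimal sketch for the data 1, 2, 3, 4, 5 using the n - 1 divisor from the notes; the standard library calls at the end are only a cross-check.

```python
from math import isclose, sqrt
from statistics import mean, stdev, variance

data = [1, 2, 3, 4, 5]
centre = mean(data)                                # 3

deviations = [x - centre for x in data]            # [-2, -1, 0, 1, 2]
sum_of_deviations = sum(deviations)                # always 0, hence the squaring
sum_of_squares = sum(d ** 2 for d in deviations)   # 4 + 1 + 0 + 1 + 4 = 10

sample_variance = sum_of_squares / (len(data) - 1) # 10 / 4 = 2.5
std_dev = sqrt(sample_variance)                    # about 1.58

# Cross-check against the standard library (it also divides by n - 1).
assert sample_variance == variance(data)
assert isclose(std_dev, stdev(data))
print(sum_of_squares, sample_variance, round(std_dev, 2))
```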

 

(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2 =  Sum of Square

Average of Sum of Squares = Sum of Squares / (n - 1) = Variance

Square root of Variance = Std Dev

1,2,3,4,5

Mean =3

(1-3)+ (2-3) + (3-3) + (4-3) + (5-3) = 0

(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2 = 4 + 1 + 0 + 1 + 4 = 10 = Sum of Squares

Variance = Sum of Squares / (n - 1) = 10/4 = 2.5

Std Dev = square root of Variance = √2.5 ≈ 1.58

Variation

Mean = 3

(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2 = 4 + 1 + 0 + 1 + 4 = 10 = Sum of Squares (SS). This is the sum of the squared distances of data from its centre.

SS/(n - 1) = 10/4 = 2.5 = Variance. This is the average of the squared distances of data from its centre.

Square root of Variance = Std Dev

 

 

 

 

 

(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2 = Sum of Squares

 = sum of the squared distances of data from its centre

[(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2] / (n - 1) = Variance

Std Dev = square root of Variance

 

 

Normality

If a process is to be considered “Normal”, it will follow the rules below:

  • Mean – 1 Std Dev to Mean + 1 Std Dev = 68% of the data
  • Mean – 2 Std Dev to Mean + 2 Std Dev = 95% of the data
  • Mean – 3 Std Dev to Mean + 3 Std Dev = 99.73% of the data

If data falls outside these limits, the process is considered non-normal or out of control, and the reason is considered special cause variation. Statistics mandates that you must conduct RCA (root cause analysis) for such special behaviour.

 

Mileage: Mean = 15, Std Dev = 1

          Lower          Upper          Mileage
68%       15 - 1 = 14    15 + 1 = 16    14 to 16
95%       15 - 2(1)      15 + 2(1)      13 to 17
99.73%    15 - 3(1)      15 + 3(1)      12 to 18

 

22 ---

Production

Mean = 200, Std Dev = 20

          Upper          Lower          Production
68%       200 + 20       200 - 20       180 to 220
95%       200 + 2(20)    200 - 2(20)    160 to 240
99.73%    200 + 3(20)    200 - 3(20)    140 to 260

 

Under normal conditions the production will be between 140 and 260 (we are 99.73% sure).

Target = 300, which lies outside this range.
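The mileage and production ranges above all come from the same arithmetic, which can be sketched as one small function. The name `sigma_bands` is hypothetical, and the percentages are the approximate ones used in these notes.

```python
def sigma_bands(mean, std_dev):
    """Return the (lower, upper) limits for 1, 2 and 3 standard deviations."""
    coverage = {1: "68%", 2: "95%", 3: "99.73%"}
    return {
        label: (mean - k * std_dev, mean + k * std_dev)
        for k, label in coverage.items()
    }

mileage = sigma_bands(15, 1)
production = sigma_bands(200, 20)

print(mileage)      # {'68%': (14, 16), '95%': (13, 17), '99.73%': (12, 18)}
print(production)   # {'68%': (180, 220), '95%': (160, 240), '99.73%': (140, 260)}

# A target of 300 is above the upper 3-sigma limit for production.
target = 300
print(target > production["99.73%"][1])   # True
```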

 

 

Route 1:          Mean = 30 Min               Std Dev = 4 Min

Route 2:          Mean = 20 Min               Std Dev = 20 Min

Route 1 (99.73%)

Upper = 30 + 3(4) = 30 + 12 = 42

Lower = 30 - 3(4) = 30 - 12 = 18

Range = 18 to 42 Min

Route 2 (99.73%)

Upper = 20 + 3(20) = 20 + 60 = 80

Lower = 20 - 3(20) = 20 - 60 = -40, taken as 0 because travel time cannot be negative

Range = 0 to 80 Min
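The route comparison can be sketched in Python as below. Flooring the lower limit at 0 mirrors the notes; treating the narrowest range as the "most predictable" route is an assumption added for illustration, not a conclusion stated in the session.

```python
def three_sigma_range(mean, std_dev, floor=0.0):
    """Mean +/- 3 Std Dev, with the lower limit floored (time cannot be negative)."""
    lower = max(floor, mean - 3 * std_dev)
    upper = mean + 3 * std_dev
    return lower, upper

# (mean, std dev) in minutes, taken from the example above.
routes = {"Route 1": (30, 4), "Route 2": (20, 20)}
ranges = {name: three_sigma_range(m, s) for name, (m, s) in routes.items()}

# Assumption: the route with the narrowest range is the most predictable.
most_predictable = min(ranges, key=lambda name: ranges[name][1] - ranges[name][0])

for name, (lower, upper) in ranges.items():
    print(f"{name}: {lower:.0f} to {upper:.0f} Min")
print(most_predictable)   # Route 1
```

Route 2 has the lower mean, but its worst case (80 minutes) is far beyond that of Route 1 (42 minutes), which is why the mean should be read together with the standard deviation.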

 

 

 

HDFC => Mean = 10%, Std Dev = 1%. Return from 7% to 13%

Mean + 3 Std Dev = 10 + 3(1) = 13

Mean – 3 Std Dev = 10 – 3(1) = 7

ICICI => Mean = 20%, Std Dev = 20%. Return from -40% to 80%

Mean + 3 Std Dev = 20 + 3(20) = 20 + 60 = 80

Mean – 3 Std Dev = 20 – 3(20) = 20 – 60 = -40

 

 

 

Mileage

Mean = 20, Std Dev = 1

68% -> 19 to 21

95% -> 18 to 22

99.73% -> 17 to 23

Under normal circumstances this is what my performance will be.

Every time you see special behaviour, you must conduct RCA.

 

Consider that the process is “normal” or within control.

Even if one data point is outside these limits, we term that as special.

Mean = 25 Min, Std Dev = 1

Mean – 1 Std Dev to Mean + 1 Std Dev = 68%: 24 Min to 26 Min

Mean – 2 Std Dev to Mean + 2 Std Dev = 95%: 23 Min to 27 Min

Mean – 3 Std Dev to Mean + 3 Std Dev = 99.73%: 22 Min to 28 Min
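The rule that even one point outside the limits counts as special can be sketched as a simple check. The function name and the list of observed times are hypothetical; only Mean = 25 Min and Std Dev = 1 come from the example above.

```python
def special_cause_points(observations, mean, std_dev, k=3):
    """Return the observations that fall outside mean +/- k standard deviations."""
    lower = mean - k * std_dev
    upper = mean + k * std_dev
    return [x for x in observations if x < lower or x > upper]

# Hypothetical travel times in minutes, invented for illustration.
travel_times = [24.5, 25.2, 26.1, 23.8, 29.0, 25.0]

flagged = special_cause_points(travel_times, mean=25, std_dev=1)
print(flagged)   # [29.0] is outside 22 to 28, so it calls for RCA
```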

 

 

 

 

Normality

Statistics has a definition of the term “normal”. Data must meet the following criteria (example: mileage with Mean = 15 and Std Dev = 2):

  • Mean +- 1 Std Dev = 68% of data: 13 to 17
  • Mean +- 2 Std Dev = 95% of data: 11 to 19
  • Mean +- 3 Std Dev = 99.73% of data: 15 - 6 = 9 to 15 + 6 = 21

If data adheres to all of the above, it is referred to as normal (or following the normal distribution).

If any of the above is not met, the process is considered non-normal and special cause variation is present in the data.
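The 68%, 95% and 99.73% figures are rounded properties of the normal distribution itself, and they can be checked with the Python standard library. This sketch uses the standard normal curve (mean 0, standard deviation 1); the helper name `coverage_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def coverage_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * coverage_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```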

 

Mileage

Mean = 15

Std Dev = 1

14 to 16 = 68%: it is 68% likely that the mileage will be between 14 and 16

13 to 17 = 95%: you are 95% sure that the mileage will be between 13 and 17

12 to 18 = 99.73%

 

 

Mean = 25 Min

Std Dev = 2 Min

25 + 2 = 27

25-2 = 23

23 to 27 Min = 68% of the time

 

 

 

HDFC MF: Mean = 10%, Std Dev = 1%

ICICI MF: Mean = 40%, Std Dev = 20%

          68%         95%        99.73%
HDFC      9 to 11     8 to 12    7 to 13
ICICI     20 to 60    0 to 80    -20 to 100

 

Normal data exhibits these properties:

  • Mean, median and mode will be equal
  • Unimodal, i.e. you will have only one mode, and it will be at the centre
  • The bell curve will accommodate the entire process performance
  • If you divide the bell curve at the centre, you will always have identical behaviour on both sides.
  • The bell curve never touches the horizontal axis

 

 

 

 

 

 

Mean must be used with Std Dev (never use the mean alone)

Median should be used with the positional values: Min, Q1, Median, Q3, Max

 

 

Mean = 25 min

Std dev = 2 min

99.73%

25 + (3*2) = 31

25 – (3*2) = 19

30 min Target

95%

25 + (2*2) = 25 + 4 = 29

25 – (2*2) = 25 – 4 = 21
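One way to read the 30 min target against these ranges is sketched below. It checks which bands contain the target and, assuming the times really are normally distributed, estimates the chance of finishing within the target. The function name and the probability calculation are additions for illustration, not part of the session notes.

```python
from statistics import NormalDist

MEAN, STD_DEV, TARGET = 25, 2, 30   # minutes, from the example above

def bands_containing(value, mean, std_dev):
    """List the 1, 2 and 3 sigma bands (by coverage label) that contain value."""
    labels = {1: "68%", 2: "95%", 3: "99.73%"}
    return [
        label
        for k, label in labels.items()
        if mean - k * std_dev <= value <= mean + k * std_dev
    ]

inside = bands_containing(TARGET, MEAN, STD_DEV)

# Assumption: times follow a normal distribution with this mean and std dev.
on_time_probability = NormalDist(MEAN, STD_DEV).cdf(TARGET)

print(inside)                          # ['99.73%']
print(round(on_time_probability, 4))   # 0.9938
```

The target sits outside the 95% range (21 to 29) but inside the 99.73% range (19 to 31), so under normal conditions it is met most of the time but not always.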

 

 

 

 

 

 

Std Deviation

Variation

  • Vis-à-vis Centre
    • Centre = Mean; we want to study the distance of data from the centre
    • 1,2,3,4,5
      • Mean = 3
      • Distance from Centre
        • (1-3) + (2-3) + (3-3) + (4-3) + (5-3) = 0
        • Because the distances always sum to zero, each one is squared
          • (1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2 = 10
          • Sum of Squares is the sum of the squared distances of data from its mean
          • Variance = Avg = [(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2] / (n - 1)
          • Variance = 10/4 = 2.5
        • Std Dev = Square root of Variance
        • Root of the average squared distance of data from its centre
  • Positional Information
    • Quartiles
      • Q1
      • Q3
      • IQR
      • SF
    • Range
      • Min
      • Max