### Machine Learning in Cloud

## Set

Python has a built-in type called `set`. The `set` type has the following characteristics:

• Sets are unordered and unindexed collections.
• Set elements are unique. Duplicate elements are not allowed.
• A set itself is mutable, but the elements within a set must be immutable (hashable).
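
The properties listed above can be checked directly in an interpreter. The following is a minimal sketch; the variable name `letters` and its values are illustrative only.

```python
# Duplicates collapse: only one copy of each value is kept.
letters = {'a', 'b', 'a', 'c', 'b'}
print(len(letters))              # 3

# Order is not part of a set's identity, so these two sets are equal.
print({1, 2, 3} == {3, 2, 1})    # True

# The set itself is mutable: elements can be added after creation.
letters.add('d')
print(sorted(letters))           # ['a', 'b', 'c', 'd']
```

`sorted()` is used for printing because the display order of a set is not guaranteed.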

You can create sets in two ways:

1. using the built-in `set` function followed by parentheses `()`.
2. using curly brackets `{}`.

## `set()`

You can pass any ITERABLE object, such as a list or tuple, to `set(<iter>)`. It returns the elements of that iterable as a `set`, which is displayed inside curly brackets `{}`. Like the `extend()` method of lists, `set()` iterates over its argument and takes each element individually.

A list within `set()`.

```python
numbers = set([1, 2, 3, 4, 5, 6, 7])
print(numbers)
```

Output:

`{1, 2, 3, 4, 5, 6, 7}`

A tuple within `set()`.

```python
numbers = set((1, 2, 3, 4, 5, 6, 7))
print(numbers)
```

Output:

`{1, 2, 3, 4, 5, 6, 7}`

A string within `set()`.

```python
my_letters = set('ABCDEF')
print(my_letters)
```

Output:

`{'E', 'F', 'C', 'B', 'A', 'D'}`

When an iterable object is converted to a set, the returned set is deduplicated.

```python
# Example 1
my_cities = set(['Krakow', 'Warsaw', 'Warsaw', 'Kielce'])

print(my_cities)

# Example 2
my_letters = set('AaBBCCDDEEE')

print(my_letters)

# Example 3
my_numbers = set('12345342')

print(my_numbers)
```

Outputs:

```
# Example 1 output
{'Krakow', 'Kielce', 'Warsaw'}

# Example 2 output
{'D', 'E', 'B', 'C', 'a', 'A'}

# Example 3 output
{'4', '1', '3', '2', '5'}
```

You can see that the outputs are unordered and deduplicated; the original order is not kept. `set()` only accepts an iterable object such as a string, list or tuple. Integers, for example, are not iterable, so Python raises a `TypeError` when we try to create a set from an integer.

```python
my_numbers = set(12345342)
print(my_numbers)
```

Output:

```
TypeError                                 Traceback (most recent call last)
<ipython-input-5-2b17322ba0b5> in <module>()
----> 1 my_numbers = set(12345342)
2
3 print(my_numbers)

TypeError: 'int' object is not iterable
```
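
If the goal is a set built from an integer, two workarounds are possible, depending on what is wanted. This is a small sketch; the variable names `number`, `single` and `digits` are illustrative.

```python
number = 12345342

# Option 1: wrap the integer in a list to get a one-element set.
single = set([number])
print(single)                    # {12345342}

# Option 2: convert to a string first to get the set of distinct digits.
digits = set(str(number))
print(sorted(digits))            # ['1', '2', '3', '4', '5']
```

Note that the second option produces digit characters (strings), not integers.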

## Curly brackets `{}`

You can create a `set` using curly brackets `{}`. The curly brackets must contain only IMMUTABLE (hashable) objects. Elements are separated by commas, as in lists and tuples; in other words, a set can be created as `{<obj1>, <obj2>, <obj3>, ......, <objn>}`.

```python
# Example 1
my_cities = {'Krakow', 'Warsaw', 'Warsaw', 'Kielce'}

print(my_cities)

# Example 2
my_letters = {'AaBBCCDDEEE'}

print(my_letters)

# Example 3
my_numbers = {12345342}

print(my_numbers)
```

Outputs:

```
# Example 1 output
{'Kielce', 'Warsaw', 'Krakow'}

# Example 2 output
{'AaBBCCDDEEE'}

# Example 3 output
{12345342}
```

As you can see, the curly brackets do not iterate through iterable elements. Each object is present in the set intact regardless of iterability.
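
The requirement that elements be immutable can be illustrated with a short sketch. The names `points`, `broken` and `nested` are made up for this example; `frozenset` is Python's built-in immutable counterpart of `set`.

```python
# Tuples are immutable (hashable), so they can be set elements.
points = {(0, 0), (1, 2), (0, 0)}
print(len(points))                   # 2

# Lists are mutable (unhashable), so a set literal containing one fails.
try:
    broken = {[1, 2], [3, 4]}
except TypeError as err:
    print(type(err).__name__)        # TypeError

# A frozenset is immutable, so sets can be nested through it.
nested = {frozenset({1, 2}), frozenset({3})}
print(frozenset({2, 1}) in nested)   # True
```

The same restriction applies to elements passed through `set()`: for instance, `set([[1, 2]])` fails for the same reason.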

## Empty set

A set can also be empty, just like a list or a tuple. You can create an empty set only with the built-in `set()` function, because Python interprets empty curly brackets `{}` as an empty dictionary.

```python
empty_set = set()

# Check empty_set type
print(type(empty_set))

print(empty_set)
```

Outputs:

```
# print(type(empty_set))
<class 'set'>

# print(empty_set)
set()
```
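
To see the difference between `{}` and `set()` side by side, consider this short sketch (the name `empty_braces` is illustrative):

```python
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)       # dict
print(type(empty_set).__name__)          # set

# An empty set has length zero and is falsy in a condition.
print(len(empty_set), bool(empty_set))   # 0 False
```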

## Mixed datatypes set

A set can contain mixed datatypes.

```python
# set function
mixed_set = set([34, 3.2, 'cat', 1.858, False, True, 'Name'])

print(mixed_set)

# Curly brackets
mixed_set_curly = {34, 3.2, 'cat', 1.858, False, True, 'Name'}

print(mixed_set_curly)
```

Outputs:

```
# set function
{False, 1.858, 34, 3.2, True, 'cat', 'Name'}

# Curly brackets
{False, 1.858, 34, 3.2, True, 'cat', 'Name'}
```
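
One thing to watch for with mixed datatypes: values that compare equal count as duplicates even when their types differ. The sketch below uses made-up names (`flags`, `numbers`) to show this.

```python
# True == 1 and False == 0, so each pair collapses to a single element.
flags = {1, True, 0, False}
print(len(flags))                # 2

# 1.0 also compares equal to 1, so it is treated as a duplicate too.
numbers = {1, 1.0, 2}
print(len(numbers))              # 2
```

The mixed set above is unaffected because it contains neither `0` nor `1` alongside `False` and `True`.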

## How to add element(s) to a set?

Sets are unordered, so their elements have no index. Although sets are mutable, we cannot use indexing or slicing to access or change their elements. Python raises a `TypeError` when you try to index or slice a set.

```python
number_set = {1, 2, 3, 4}
print(number_set[:2])
```

Output:

```
TypeError                                 Traceback (most recent call last)
<ipython-input-11-c24bd2d35a09> in <module>()
1 number_set = {1, 2, 3, 4}
2
----> 3 print(number_set[:2])

TypeError: 'set' object is not subscriptable
```

You can use the set method `add()` to add a single element. It takes exactly one argument (`add(<obj>)`).

```python
new_set = {9, 8, 7, 6}
new_set.add(5)
print(new_set)
```

Output:

```
# print(new_set)
{5, 6, 7, 8, 9}
```

You can use the set method `update()` to add several elements at once. `update()` requires an iterable, such as a string, list, tuple or another set (`update(<iter>)`).

```python
new_set = {9, 8, 7, 6}
new_set.update([5, 2, 4, 3])
print(new_set)
```

Output:

```
# print(new_set)
{2, 3, 4, 5, 6, 7, 8, 9}
```
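
The difference between `add()` and `update()` is easiest to see with a string argument. A minimal sketch with illustrative names (`words`, `chars`, `numbers`):

```python
words = {'cat'}

# add() inserts its argument as a single element.
words.add('dog')
print(sorted(words))             # ['cat', 'dog']

# update() iterates over its argument, so a string is split into characters.
chars = set()
chars.update('dog')
print(sorted(chars))             # ['d', 'g', 'o']

# update() also accepts several iterables at once.
numbers = {1}
numbers.update([2, 3], (4, 5), {6})
print(sorted(numbers))           # [1, 2, 3, 4, 5, 6]
```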

## How to delete element(s) from a set?

You can delete an element from a set using `discard()` or `remove()`.

### remove function

`remove()` deletes the element if it is present and raises a `KeyError` if it is absent.

```python
# Element is present
new_set = {9, 8, 7, 6}

new_set.remove(8)
print(new_set)

# Element is absent
new_set = {9, 8, 7, 6}

new_set.remove(5)
print(new_set)
```

Outputs:

```
# Element is present
# print(new_set)
{9, 6, 7}

# Element is absent
# print(new_set)
KeyError                                  Traceback (most recent call last)
<ipython-input-23-25ec930c2723> in <module>()
1 new_set = {9, 8, 7, 6}
2
----> 3 new_set.remove(5)
4
5 print(new_set)

KeyError: 5
```

You can also use `discard()` to delete an element. If the element is a member of the set, `discard()` removes it; if it is not, `discard()` does nothing.

```python
# Element is present
new_set = {9, 8, 7, 6}

new_set.discard(8)
print(new_set)

# Element is absent
new_set = {9, 8, 7, 6}

new_set.discard(5)
print(new_set)
```

Outputs:

```
# Element is present
# print(new_set)
{9, 6, 7}

# Element is absent
# print(new_set)
{8, 9, 6, 7}
```
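
Which of the two to choose depends on whether a missing element should count as an error. A sketch of both styles, reusing the same example values:

```python
new_set = {9, 8, 7, 6}

# remove(): handle the missing element explicitly.
try:
    new_set.remove(5)
except KeyError:
    print('5 is not in the set')

# discard(): no guard needed, a missing element is silently ignored.
new_set.discard(5)
print(sorted(new_set))           # [6, 7, 8, 9]
```

`remove()` suits cases where absence signals a bug; `discard()` suits cleanup code where the element may or may not be there.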

### pop function

You can use `pop()` on a set. `pop()` removes and returns an arbitrary element, because sets are unordered.

```python
new_set = {9, 8, 5, 4, 7, 6}

print(new_set.pop())
print(new_set)
```

Outputs:

```
# print(new_set.pop())
4

# print(new_set)
{5, 6, 7, 8, 9}
```
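
Because `pop()` removes the element it returns, it can be used to drain a set. The sketch below (with illustrative names `remaining`, `popped` and `other`) also shows what happens once the set is empty, and the related `clear()` method.

```python
remaining = {9, 8, 7}

# Drain the set one arbitrary element at a time.
popped = []
while remaining:
    popped.append(remaining.pop())
print(sorted(popped))            # [7, 8, 9]

# pop() on an empty set raises KeyError.
try:
    remaining.pop()
except KeyError:
    print('cannot pop from an empty set')

# clear() removes every element in one call.
other = {1, 2, 3}
other.clear()
print(other)                     # set()
```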

## Set methods and operators

You can use Python set methods and operators to perform operations such as union, intersection, difference and symmetric difference.

### union

The union of two sets is the set made by combining the elements of both. In a Venn diagram, the union of set 1 and set 2 covers both circles entirely.

You can use the `union()` method or the `|` operator.

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
new_set = set_1.union(set_2)
print(new_set)

# method 2
new_set_2 = set_1 | set_2
print(new_set_2)
```

Outputs:

```
# print(new_set)
{1, 3, 4, 5, 6, 7, 8, 9}

# print(new_set_2)
{1, 3, 4, 5, 6, 7, 8, 9}
```

The `|` operator creates a union of two sets; both sides have to be sets, otherwise it raises an error. The `union()` method, in contrast, takes any iterable and converts it to a set before performing the union. See the example below; notice that the second operand is a tuple.

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = (1, 4, 3, 5, 6)

# method 1
new_set = set_1.union(set_2)
print(new_set)

# method 2
print(set_1 | set_2)
```

Outputs:

```
# print(new_set)
{1, 3, 4, 5, 6, 7, 8, 9}

# print(set_1 | set_2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-5ce450bc75fa> in <module>()
8
9 # method 2
---> 10 print(set_1 | set_2)

TypeError: unsupported operand type(s) for |: 'set' and 'tuple'
```

As you can see, `union()` runs successfully but the `|` operator raises a `TypeError`.
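
If you prefer the operator form, one way around the error is to convert the other operand to a set first. A minimal sketch; `tuple_2` and `combined` are names chosen for this example.

```python
set_1 = {4, 5, 6, 7, 8, 9}
tuple_2 = (1, 4, 3, 5, 6)

# Convert the tuple explicitly so the | operator sees two sets.
combined = set_1 | set(tuple_2)
print(sorted(combined))          # [1, 3, 4, 5, 6, 7, 8, 9]

# |= updates the left-hand set in place, similar to update().
set_1 |= set(tuple_2)
print(set_1 == combined)         # True
```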

### intersection

The intersection of two sets contains only the elements that are present in both sets, that is, the overlapping elements. In a Venn diagram, the intersection of set 1 and set 2 is the overlapping section only (set 1 ∩ set 2).

You can use the `intersection()` method or the `&` operator to get the intersection of two sets.

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
new_set = set_1.intersection(set_2)
print(new_set)

# method 2
new_set_2 = set_1 & set_2
print(new_set_2)
```

Outputs:

```
# print(new_set)
{4, 5, 6}

# print(new_set_2)
{4, 5, 6}
```
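
A common use of intersection is finding the items two collections have in common. The lists below are hypothetical data invented for this sketch; `isdisjoint()` is a related built-in set method.

```python
enrolled = ['ana', 'ben', 'cy', 'ana']
attended = ['cy', 'dee', 'ana']

# intersection() accepts any iterable, so only one side needs converting.
both = set(enrolled).intersection(attended)
print(sorted(both))                  # ['ana', 'cy']

# isdisjoint() reports whether two sets share no elements at all.
print({1, 2}.isdisjoint({3, 4}))     # True
print({1, 2}.isdisjoint({2, 3}))     # False
```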

### difference

You can use the `difference()` method or the `-` operator to get the difference of two sets. The difference of set A and set B is the set of elements that are present in set A but not in set B. The difference of set B and set A is the reverse: the elements present in set B but not in set A.

#### set 1 difference

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
set_1_diff = set_1.difference(set_2)
print(set_1_diff)

# method 2
set_1_diff_op = set_1 - set_2
print(set_1_diff_op)
```

Outputs:

```
# print(set_1_diff)
{7, 8, 9}

# print(set_1_diff_op)
{7, 8, 9}
```

#### set 2 difference

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
set_2_diff = set_2.difference(set_1)
print(set_2_diff)

# method 2
set_2_diff_op = set_2 - set_1
print(set_2_diff_op)
```

Outputs:

```
# print(set_2_diff)
{1, 3}

# print(set_2_diff_op)
{1, 3}
```

### symmetric difference

The symmetric difference is the set of all elements from set A and set B that are not shared between them. It can be seen as the opposite of intersection.

You can use the `symmetric_difference()` method or the `^` operator.

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

# method 1
sym_diff = set_1.symmetric_difference(set_2)
print(sym_diff)

# method 2
sym_diff_op = set_1 ^ set_2
print(sym_diff_op)
```

Outputs:

```
# print(sym_diff)
{1, 3, 7, 8, 9}

# print(sym_diff_op)
{1, 3, 7, 8, 9}
```
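
The symmetric difference can also be expressed through the other three operations, which may help in seeing how they relate. A short sketch; `via_union` and `via_differences` are names chosen for this example.

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}

sym_diff = set_1 ^ set_2

# Everything in either set, minus what the two sets share.
via_union = (set_1 | set_2) - (set_1 & set_2)

# The two one-sided differences joined together.
via_differences = (set_1 - set_2) | (set_2 - set_1)

print(sym_diff == via_union == via_differences)   # True
print(sorted(sym_diff))                           # [1, 3, 7, 8, 9]
```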

The operators above can be chained across more than two sets, and the `union()`, `intersection()` and `difference()` methods accept multiple arguments. The exception is the `symmetric_difference()` method, which takes exactly one argument.

```python
set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}
set_3 = {1, 5, 6, 10}

# method 1
sym_diff_op = set_1 ^ set_2 ^ set_3
print(sym_diff_op)

# method 2
sym_diff = set_1.symmetric_difference(set_2, set_3)
print(sym_diff)
```

Outputs:

```
# print(sym_diff_op)
{3, 5, 6, 7, 8, 9, 10}

# print(sym_diff)
TypeError                                 Traceback (most recent call last)
<ipython-input-20-cc52d7471fff> in <module>()
9
10 # method 2
---> 11 sym_diff = set_1.symmetric_difference(set_2, set_3)
12 print(sym_diff)

TypeError: symmetric_difference() takes exactly one argument (2 given)
```
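
For completeness, here is a sketch of the multi-set forms that do work, along with two possible ways to get a symmetric difference across several sets using the method form. Using `functools.reduce` with `operator.xor` is one option among others, not the only approach; `chained` and `folded` are illustrative names.

```python
from functools import reduce
from operator import xor

set_1 = {4, 5, 6, 7, 8, 9}
set_2 = {1, 4, 3, 5, 6}
set_3 = {1, 5, 6, 10}

# union(), intersection() and difference() accept several arguments.
print(sorted(set_1.union(set_2, set_3)))          # [1, 3, 4, 5, 6, 7, 8, 9, 10]
print(sorted(set_1.intersection(set_2, set_3)))   # [5, 6]
print(sorted(set_1.difference(set_2, set_3)))     # [7, 8, 9]

# symmetric_difference() takes one argument, so chain the calls ...
chained = set_1.symmetric_difference(set_2).symmetric_difference(set_3)

# ... or fold the ^ operator over a list of sets.
folded = reduce(xor, [set_1, set_2, set_3])

print(sorted(chained))                            # [3, 5, 6, 7, 8, 9, 10]
print(chained == folded)                          # True
```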