Python

How to fix Python asyncio RuntimeError: There is no current event loop in thread …

Problem:

You are trying to run your Python application, but you see an error message like

Traceback (most recent call last):
  [...]
  File "/mnt/KATranslationCheck/CrowdinLogin.py", line 38, in get_crowdin_tokens
    return asyncio.get_event_loop().run_until_complete(async_get_crowdin_tokens(username, password))
  File "/usr/lib/python3.6/asyncio/events.py", line 694, in get_event_loop
    return get_event_loop_policy().get_event_loop()
  File "/usr/lib/python3.6/asyncio/events.py", line 602, in get_event_loop
    % threading.current_thread().name)
RuntimeError: There is no current event loop in thread 'worker 3'.

Solution:

You are trying to run asyncio.get_event_loop() in some thread other than the main thread – however, asyncio only creates an event loop automatically for the main thread.

Use this function instead of asyncio.get_event_loop():

import asyncio

def get_or_create_eventloop():
    try:
        return asyncio.get_event_loop()
    except RuntimeError as ex:
        if "There is no current event loop in thread" in str(ex):
            # No loop exists for this thread yet: create and register one
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop)
            return loop
        # Any other RuntimeError is unexpected: re-raise it
        raise

It will first try asyncio.get_event_loop(). In case that doesn’t work, it will generate a new event loop for the current thread using

loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

and then return this event loop.

Note that while this works well for generating the event loop, depending on the way you use the event loop, you might encounter further error messages like

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pyppeteer/launcher.py", line 305, in launch
    return await Launcher(options, **kwargs).launch()
  File "/usr/local/lib/python3.6/dist-packages/pyppeteer/launcher.py", line 157, in launch
    signal.signal(signal.SIGINT, _close_process)
  File "/usr/lib/python3.6/signal.py", line 47, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
ValueError: signal only works in main thread

that are usually not as easy to fix and require some restructuring of your program.
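
If you are on Python 3.7 or newer and control the code that runs in the worker thread, a possible alternative is asyncio.run(), which creates a fresh event loop, runs the coroutine and closes the loop again. The following is a minimal sketch under that assumption; compute_answer and worker are hypothetical names used only for illustration, and this does not avoid the signal-related error shown above:

```python
import asyncio
import threading

async def compute_answer():
    # Stand-in for real asynchronous work (hypothetical example coroutine)
    await asyncio.sleep(0.01)
    return 42

results = []

def worker():
    # asyncio.run() creates a new event loop for this thread,
    # runs the coroutine to completion and closes the loop afterwards
    results.append(asyncio.run(compute_answer()))

thread = threading.Thread(target=worker, name="worker 3")
thread.start()
thread.join()
print(results)  # [42]
```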

Posted by Uli Köhler in Python

How to fix pyppeteer pyppeteer.errors.BrowserError: Browser closed unexpectedly:

Problem:

You want to run your Pyppeteer application on Linux, but you see an error message like

Traceback (most recent call last):
  File "PyppeteerExample.py", line 15, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "PyppeteerExample.py", line 6, in main
    browser = await launch()
  File "/usr/local/lib/python3.6/dist-packages/pyppeteer/launcher.py", line 305, in launch
    return await Launcher(options, **kwargs).launch()
  File "/usr/local/lib/python3.6/dist-packages/pyppeteer/launcher.py", line 166, in launch
    self.browserWSEndpoint = get_ws_endpoint(self.url)
  File "/usr/local/lib/python3.6/dist-packages/pyppeteer/launcher.py", line 225, in get_ws_endpoint
    raise BrowserError('Browser closed unexpectedly:\n')
pyppeteer.errors.BrowserError: Browser closed unexpectedly:

Solution:

In most cases, the underlying error behind this message is Puppeteer’s libX11-xcb.so.1: cannot open shared object file: No such file or directory. To fix that, you need to install the dependency libraries for Chromium, which is used internally by Puppeteer / Pyppeteer:

sudo apt install -y gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget

 

Posted by Uli Köhler in Pyppeteer, Python

How to get hostmask/netmask for given prefix length in Python

In order to get the netmask or host mask for e.g. a /112 IPv6 prefix, use:

import ipaddress
# Get netmask for a /112 prefix
ipaddress.IPv6Network("::/112").netmask

# Get host mask for a /112 prefix
ipaddress.IPv6Network("::/112").hostmask
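
As a sketch, the same approach can be wrapped in a small helper that works for both IPv4 and IPv6. The function name masks_for_prefix is hypothetical and not part of the ipaddress module:

```python
import ipaddress

def masks_for_prefix(prefix_length, version=6):
    """Return (netmask, hostmask) for the given prefix length."""
    if version == 6:
        network = ipaddress.IPv6Network(f"::/{prefix_length}")
    else:
        network = ipaddress.IPv4Network(f"0.0.0.0/{prefix_length}")
    return network.netmask, network.hostmask

netmask, hostmask = masks_for_prefix(112)
print(netmask)   # ffff:ffff:ffff:ffff:ffff:ffff:ffff:0
print(hostmask)  # ::ffff

netmask4, hostmask4 = masks_for_prefix(24, version=4)
print(netmask4)  # 255.255.255.0
print(hostmask4) # 0.0.0.255
```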

 

Posted by Uli Köhler in Networking, Python

How to fix Python3 TypeError: unsupported operand type(s) for &: ‘bytes’ and ‘bytes’

Problem:

You want to perform bitwise boolean operations on bytes() arrays in Python, but you see an error message like

TypeError: unsupported operand type(s) for &: 'bytes' and 'bytes'

or

TypeError: unsupported operand type(s) for |: 'bytes' and 'bytes'

or

TypeError: unsupported operand type(s) for ^: 'bytes' and 'bytes'

Solution:

Python can’t perform bitwise operations directly on byte arrays. However, you can use the code from How to perform bitwise boolean operations on bytes() in Python3:

def bitwise_and_bytes(a, b):
    result_int = int.from_bytes(a, byteorder="big") & int.from_bytes(b, byteorder="big")
    return result_int.to_bytes(max(len(a), len(b)), byteorder="big")

def bitwise_or_bytes(a, b):
    result_int = int.from_bytes(a, byteorder="big") | int.from_bytes(b, byteorder="big")
    return result_int.to_bytes(max(len(a), len(b)), byteorder="big")

def bitwise_xor_bytes(a, b):
    result_int = int.from_bytes(a, byteorder="big") ^ int.from_bytes(b, byteorder="big")
    return result_int.to_bytes(max(len(a), len(b)), byteorder="big")

# Example usage:

a = bytes([0x00, 0x01, 0x02, 0x03])
b = bytes([0x03, 0x02, 0x01, 0xff])

print(bitwise_and_bytes(a, b)) # b'\x00\x00\x00\x03'
print(bitwise_or_bytes(a, b)) # b'\x03\x03\x03\xff'
print(bitwise_xor_bytes(a, b)) # b'\x03\x03\x03\xfc'
Posted by Uli Köhler in Python

How to perform bitwise boolean operations on bytes() in Python3

Performing bitwise operations on bytes() instances in Python 3.2+ is not supported directly, but it is easy to do via integers:

  1. Use int.from_bytes(...) to acquire an integer representing the byte array
  2. Perform bitwise operations with said integer
  3. Use result.to_bytes(...) to convert the integer back to a bytes() array

Note that for the result to make any sense, you need to ensure that both bytes() instances have the same length.

Python code:

def bitwise_and_bytes(a, b):
    result_int = int.from_bytes(a, byteorder="big") & int.from_bytes(b, byteorder="big")
    return result_int.to_bytes(max(len(a), len(b)), byteorder="big")

def bitwise_or_bytes(a, b):
    result_int = int.from_bytes(a, byteorder="big") | int.from_bytes(b, byteorder="big")
    return result_int.to_bytes(max(len(a), len(b)), byteorder="big")

def bitwise_xor_bytes(a, b):
    result_int = int.from_bytes(a, byteorder="big") ^ int.from_bytes(b, byteorder="big")
    return result_int.to_bytes(max(len(a), len(b)), byteorder="big")

Example usage:

a = bytes([0x00, 0x01, 0x02, 0x03])
b = bytes([0x03, 0x02, 0x01, 0xff])

print(bitwise_and_bytes(a, b)) # b'\x00\x00\x00\x03'
print(bitwise_or_bytes(a, b)) # b'\x03\x03\x03\xff'
print(bitwise_xor_bytes(a, b)) # b'\x03\x03\x03\xfc'
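
An alternative sketch that avoids the round trip through integers is to combine the two inputs byte by byte with zip(). The function name bitwise_op_bytes is hypothetical; unlike the functions above, this variant raises ValueError when the lengths differ instead of padding the shorter input:

```python
import operator

def bitwise_op_bytes(op, a, b):
    """Apply the binary operator op to each pair of bytes from a and b."""
    if len(a) != len(b):
        raise ValueError("Both bytes() instances must have the same length")
    return bytes(op(x, y) for x, y in zip(a, b))

a = bytes([0x00, 0x01, 0x02, 0x03])
b = bytes([0x03, 0x02, 0x01, 0xff])

print(bitwise_op_bytes(operator.and_, a, b)) # b'\x00\x00\x00\x03'
print(bitwise_op_bytes(operator.or_, a, b))  # b'\x03\x03\x03\xff'
print(bitwise_op_bytes(operator.xor, a, b))  # b'\x03\x03\x03\xfc'
```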

 

Posted by Uli Köhler in Python

Bitwise operation with IPv6 addresses and networks in Python

Python3 features the easy-to-use ipaddress library providing many calculations. However, bitwise boolean operators are not available for addresses.

This post shows you how to perform bitwise operations with IPv6Address() objects. We’ll use the following strategy:

  1. Use .packed to get a binary bytes() instance of the IP address
  2. Use int.from_bytes() to acquire an integer representing the binary address
  3. Perform bitwise operations with said integer
  4. Use result.to_bytes(16, ...) to convert back the integer to a bytes() array in the correct byte order
  5. Construct an IPv6Address() object from the resulting byte array.

Python code:

import ipaddress

def bitwise_and_ipv6(addr1, addr2):
    result_int = int.from_bytes(addr1.packed, byteorder="big") & int.from_bytes(addr2.packed, byteorder="big")
    return ipaddress.IPv6Address(result_int.to_bytes(16, byteorder="big"))

def bitwise_or_ipv6(addr1, addr2):
    result_int = int.from_bytes(addr1.packed, byteorder="big") | int.from_bytes(addr2.packed, byteorder="big")
    return ipaddress.IPv6Address(result_int.to_bytes(16, byteorder="big"))

def bitwise_xor_ipv6(addr1, addr2):
    result_int = int.from_bytes(addr1.packed, byteorder="big") ^ int.from_bytes(addr2.packed, byteorder="big")
    return ipaddress.IPv6Address(result_int.to_bytes(16, byteorder="big"))

Example usage:

a = ipaddress.IPv6Address('2001:16b8:2703:8835:9ec7:a6ff:febe:96b1')
b = ipaddress.IPv6Address('2001:16b8:2703:4241:9ec7:a6ff:febe:96b1')

print(bitwise_and_ipv6(a, b)) # prints 2001:16b8:2703:1:9ec7:a6ff:febe:96b1
print(bitwise_or_ipv6(a, b)) # prints 2001:16b8:2703:ca75:9ec7:a6ff:febe:96b1
print(bitwise_xor_ipv6(a, b)) # prints 0:0:0:ca74::

Similarly, you can use the code in order to manipulate IPv6Network() instances:

a = ipaddress.IPv6Network('2001:16b8:2703:8835:9ec7:a6ff:febe::/112')
b = ipaddress.IPv6Network('2001:16b8:2703:4241:9ec7:a6ff:febe::/112')

print(bitwise_and_ipv6(a.network_address, b.network_address)) # prints 2001:16b8:2703:1:9ec7:a6ff:febe:0
print(bitwise_or_ipv6(a.network_address, b.network_address)) # prints 2001:16b8:2703:ca75:9ec7:a6ff:febe:0
print(bitwise_xor_ipv6(a.network_address, b.network_address)) # prints 0:0:0:ca74::

Note that the return type will always be IPv6Address() and never IPv6Network() since the result of the bitwise operation doesn’t have any netmask associated with it.

Besides .network_address you can also use other properties of IPv6Network() instances that return an IPv6Address(), like .broadcast_address, .hostmask or .netmask.
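
A shorter variant, shown here as a sketch, relies on the fact that IPv6Address() objects can be converted with int() and constructed from an integer. The function name network_part is hypothetical; it masks an address with the netmask of a network:

```python
import ipaddress

def network_part(address, network):
    """Return the part of address that is covered by the netmask of network."""
    return ipaddress.IPv6Address(int(address) & int(network.netmask))

addr = ipaddress.IPv6Address('2001:16b8:2703:8835:9ec7:a6ff:febe:96b1')
net = ipaddress.IPv6Network('2001:16b8:2703:8835::/64')

print(network_part(addr, net)) # prints 2001:16b8:2703:8835::
```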

Posted by Uli Köhler in Networking, Python

How to get current page URL in pyppeteer

In pyppeteer you can use

url = await page.evaluate("() => window.location.href")

in order to get the current URL. Note that page.evaluate() runs whatever Javascript you give it – hence you can use your Javascript skills in order to create the desired effect.

Full example

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://www.techoverflow.net')

    # Get the URL and print it
    url = await page.evaluate("() => window.location.href")
    print(url) # prints https://www.techoverflow.net/

    # Cleanup
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

 

Posted by Uli Köhler in Pyppeteer, Python

How to simulate a click using pyppeteer

In order to click a button or a link using the pyppeteer library, you can use page.evaluate().

If you have a <button> element or a link (<a>) like

<button id="mybutton">

you can use

# Now click the button
await page.evaluate(f"""() => {{
    document.getElementById('mybutton').dispatchEvent(new MouseEvent('click', {{
        bubbles: true,
        cancelable: true,
        view: window
    }}));
}}""")

in order to generate a MouseEvent that simulates a click. Note that page.evaluate() will run any Javascript code you pass to it, so you can put your Javascript skills to use in order to manipulate the page.

Also see https://gomakethings.com/how-to-simulate-a-click-event-with-javascript/ for more details on how to simulate mouse clicks in pure Javascript without relying on jQuery.

Full example

This example will open https://techoverflow.net, enter a search term into the search field, click the search button and then create a screenshot

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://techoverflow.net')

    # Fill content into the search field
    content = "pyppeteer"
    await page.evaluate(f"""() => {{
        document.getElementById('s').value = '{content}';
    }}""")

    # Now click the search button    
    await page.evaluate(f"""() => {{
        document.getElementById('searchsubmit').dispatchEvent(new MouseEvent('click', {{
            bubbles: true,
            cancelable: true,
            view: window
        }}));
    }}""")

    # Wait until search results page has been loaded
    await page.waitForSelector(".archive-title")

    # Now take screenshot and exit
    await page.screenshot({'path': 'screenshot.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Posted by Uli Köhler in Pyppeteer, Python

How to fill <input> field using pyppeteer

In order to fill an input field using the pyppeteer library, you can use page.evaluate().

If you have an <input> element like

<input name="myinput" id="myinput" type="text">

you can use

content = "My content" # This will be filled into <input id="myinput"> !
await page.evaluate(f"""() => {{
    document.getElementById('myinput').value = '{content}';
}}""")

Note that page.evaluate() will just run any Javascript code you give it, so you can put your Javascript skills to use in order to manipulate the page.
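
Inserting content directly into the Javascript source breaks as soon as the text contains a quote or a backslash. One way to avoid this, sketched below, is to encode the values with json.dumps(), whose output can be used as a Javascript string literal. build_fill_script is a hypothetical helper and not part of pyppeteer:

```python
import json

def build_fill_script(element_id, content):
    """Build the Javascript function source that sets the value of an <input>."""
    # json.dumps() adds the surrounding quotes and escapes special characters
    return f"""() => {{
    document.getElementById({json.dumps(element_id)}).value = {json.dumps(content)};
}}"""

script = build_fill_script("myinput", 'It\'s a "test"')
print(script)
# The result would then be passed to pyppeteer: await page.evaluate(script)
```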

Full example

This example will open https://techoverflow.net, enter a search term into the search field and then create a screenshot

#!/usr/bin/env python3
import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://techoverflow.net')
    
    # This example fills content into the search field
    content = "My search term"
    await page.evaluate(f"""() => {{
        document.getElementById('s').value = '{content}';
    }}""")

    # Make screenshot
    await page.screenshot({'path': 'screenshot.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Posted by Uli Köhler in Pyppeteer, Python

How to get duration of WAV file in Python (minimal example)

Use

duration_seconds = mywav.getnframes() / mywav.getframerate()

to get the duration of a WAV file in seconds.

Full example:

import wave

with wave.open("myaudio.wav") as mywav:
    duration_seconds = mywav.getnframes() / mywav.getframerate()
    print(f"Length of the WAV file: {duration_seconds:.1f} s")
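
If you don’t have a WAV file at hand, the following sketch first writes half a second of silence using the wave module and then computes its duration in the same way. The file name silence.wav and the chosen audio parameters are assumptions made only for this example:

```python
import os
import tempfile
import wave

# Write 4000 frames of silence at 8000 Hz (mono, 16 bit) to a temporary file
path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b"\x00\x00" * 4000)

# Read the file back and compute its duration
with wave.open(path, "rb") as mywav:
    duration_seconds = mywav.getnframes() / mywav.getframerate()

print(f"Length of the WAV file: {duration_seconds:.1f} s")  # 0.5 s
```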

 

Posted by Uli Köhler in Audio, Python

How to create filename containing date/time in Python

In data logging, you quite often have to create a new log file once you start to log data.

Often it’s convenient to include the current date and time in the log file name. In Python, this is pretty easy to do:

from datetime import datetime

filename = f"Temperature log {datetime.now():%Y-%m-%d %H-%M-%S}.csv"

This will create filenames like

Temperature log 2020-06-17 22-37-41.csv
Temperature log 2019-12-31 00-15-55.csv

Note that if you use another date/time format, you need to avoid special characters that must not occur in filenames. The rules for valid filenames are much more permissive on Linux than on Windows, but since you should be compatible with both operating systems, you should always check the Windows rules.

These characters are forbidden for Windows filenames:

<>:"/\|?*

The date-time format we used above, %Y-%m-%d %H-%M-%S, is specially crafted to avoid the colons of ISO-8601-like date/time formats such as 2020-04-02 11:45:33, since colons are illegal in Windows filenames (they would work in Linux filenames, though). %Y-%m-%d %H-%M-%S only contains spaces and dash (-) characters in order to avoid any issues with filename rules.
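
The following sketch formats a fixed timestamp with hours, minutes and seconds (%H-%M-%S) and checks the result against the forbidden characters listed above. The constant name FORBIDDEN_WINDOWS_CHARS is an assumption made for this example:

```python
from datetime import datetime

FORBIDDEN_WINDOWS_CHARS = '<>:"/\\|?*'

# A fixed timestamp is used so that the result is reproducible
timestamp = datetime(2020, 6, 17, 22, 37, 41)
filename = f"Temperature log {timestamp:%Y-%m-%d %H-%M-%S}.csv"

print(filename)  # Temperature log 2020-06-17 22-37-41.csv
print(any(c in FORBIDDEN_WINDOWS_CHARS for c in filename))  # False
```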

Posted by Uli Köhler in Python

What fraction of the year has passed until a given Timestamp in pandas?

To compute what fraction of the year has passed up to a given Timestamp, use this function:

import pandas as pd

def fraction_of_year_passed(date):
    """Compute what fraction of the current year has already passed up to the given date"""
    start_of_year = pd.Timestamp(date.year, 1, 1)
    start_of_next_year = pd.Timestamp(date.year + 1, 1, 1)
    # Compute seconds in entire year and seconds since start of year
    entire_year_seconds = (start_of_next_year - start_of_year).total_seconds()
    seconds_since_start_of_year = (date - start_of_year).total_seconds()
    return seconds_since_start_of_year / entire_year_seconds

Usage example:

print(fraction_of_year_passed(pd.Timestamp("2020-03-01"))) # prints 0.16393442622950818

Detailed explanation:

First, we define the start of the calendar year that date belongs to, and the start of the calendar year after that:

start_of_year = pd.Timestamp(date.year, 1, 1)
start_of_next_year = pd.Timestamp(date.year + 1, 1, 1)

Now we compute the number of seconds in the entire year and the number of seconds passed between the start of the year and date:

entire_year_seconds = (start_of_next_year - start_of_year).total_seconds()
seconds_since_start_of_year = (date - start_of_year).total_seconds()

The rest is simple: Just divide seconds_since_start_of_year by entire_year_seconds to obtain what fraction of the year has passed until date.
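
For comparison, here is a sketch of the same calculation using only the standard library’s datetime module instead of pandas. The function name fraction_of_year_passed_stdlib is hypothetical and it expects a naive datetime object:

```python
from datetime import datetime

def fraction_of_year_passed_stdlib(date):
    """Compute what fraction of the year has passed up to the given datetime"""
    start_of_year = datetime(date.year, 1, 1)
    start_of_next_year = datetime(date.year + 1, 1, 1)
    # Compute seconds in entire year and seconds since start of year
    entire_year_seconds = (start_of_next_year - start_of_year).total_seconds()
    seconds_since_start_of_year = (date - start_of_year).total_seconds()
    return seconds_since_start_of_year / entire_year_seconds

# 60 days of the 366 days in 2020 have passed on 2020-03-01
print(fraction_of_year_passed_stdlib(datetime(2020, 3, 1)))
# Noon on 2021-07-02 is exactly the middle of the 365-day year 2021
print(fraction_of_year_passed_stdlib(datetime(2021, 7, 2, 12)))  # 0.5
```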

Posted by Uli Köhler in pandas, Python

How to compute number of days in a year in Pandas

In our previous post we showed how to use the pendulum library in order to compute the number of days in a given year.

This post shows how to achieve the same using pandas:

import pandas as pd
def number_of_days_in_year(year):
    start = pd.Timestamp(year, 1, 1)
    end = pd.Timestamp(year + 1, 1, 1)
    return (end - start).days

Usage example:

print(number_of_days_in_year(2020)) # Prints 366
print(number_of_days_in_year(2021)) # Prints 365

Explanation:

First, we define the start date to be the first day (1st of January) of the year we’re interested in:

start = pd.Timestamp(year, 1, 1)

Now we generate the end date, which is the 1st of January of the following year:

end = pd.Timestamp(year + 1, 1, 1)

The rest is simple: Just compute the difference (end - start) and ask pandas to give us the number of days:

(end - start).days
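
As a cross-check, the same idea can be sketched with the standard library only: subtract two datetime.date objects and compare the result with calendar.isleap(). The function name number_of_days_in_year_stdlib is hypothetical:

```python
import calendar
from datetime import date

def number_of_days_in_year_stdlib(year):
    """Days between the 1st of January of year and of the following year."""
    return (date(year + 1, 1, 1) - date(year, 1, 1)).days

for year in (2020, 2021, 2100):
    # A leap year has 366 days, any other year has 365 days
    expected = 366 if calendar.isleap(year) else 365
    print(year, number_of_days_in_year_stdlib(year), expected)
```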

 

Posted by Uli Köhler in pandas, Python

How to generate range of dates in pandas

In this example, we’ll create a list of pandas Timestamp objects that represent 100 consecutive days, starting at a fixed date:

start_date = pd.Timestamp("2020-03-01")

Generating the 100 consecutive days is easy:

all_days = [start_date + pd.Timedelta(d, "days") for d in range(100)]

Note that range(100) will generate all numbers from 0 up to and including 99. Hence, [pd.Timedelta(d, "days") for d in range(100)] will generate a list of Timedeltas that represent 0 days, 1 days, 2 days, …, 99 days.

Full example:

import pandas as pd

start_date = pd.Timestamp("2020-03-01")
all_days = [start_date + pd.Timedelta(d, "days") for d in range(100)]

print(all_days)

 

Posted by Uli Köhler in pandas, Python

How to compute number of days in a year in Python using Pendulum

Also see How to compute number of days in a year in Pandas

We can use the excellent pendulum library to find the number of days in a given year

import pendulum

def number_of_days_in_year(year):
    start = pendulum.date(year, 1, 1)
    end = start.add(years=1)
    return (end - start).in_days()

Usage example:

print(number_of_days_in_year(2020)) # Prints 366
print(number_of_days_in_year(2021)) # Prints 365

Explanation:

First, we define the start date to be the first day (1st of January) of the year we’re interested in:

start = pendulum.date(year, 1, 1)

Now we use pendulum’s add function to add exactly one year to that date. This will always result in the 1st of January of the year after the given year:

end = start.add(years=1)

The rest is simple: Just ask pendulum to give us the number of days in the difference between end and start:

(end - start).in_days()

 

Posted by Uli Köhler in Python

How to skip first element of a Generator/Iterator in Python

Use the skip_first() utility function from UliEngineering:

First, install UliEngineering using

pip install --user UliEngineering

Note that UliEngineering requires Python 3.3+.

Now you can use skip_first() like this:

from UliEngineering.Utils.Iterable import skip_first

for v in skip_first(v for v in [1,2,3,4,5]):
    print(v) # Prints 2,3,4,5

skip_first() will work for any Iterable or Iterator.

Don’t want to install UliEngineering?

Copy the skip_first() utility function into your own code:

import collections.abc

def skip_first(it):
    """
    Skip the first element of an Iterator or Iterable,
    like a Generator or a list.
    This will always return a generator or raise TypeError()
    in case the argument's type is not compatible
    """
    # Note: The abstract base classes live in collections.abc
    # (the aliases in collections were removed in Python 3.10)
    if isinstance(it, collections.abc.Iterator):
        try:
            next(it)
            yield from it
        except StopIteration:
            return
    elif isinstance(it, collections.abc.Iterable):
        yield from skip_first(it.__iter__())
    else:
        raise TypeError(f"You must pass an Iterator or an Iterable to skip_first(), but you passed {it}")
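
If you only need the skipping behaviour, a shorter sketch based on itertools.islice() from the standard library is possible as well. The function name skip_first_islice is hypothetical and not part of UliEngineering:

```python
import itertools

def skip_first_islice(it):
    """Skip the first element of any Iterable or Iterator using itertools.islice()."""
    return itertools.islice(it, 1, None)

print(list(skip_first_islice(v for v in [1, 2, 3, 4, 5])))  # [2, 3, 4, 5]
print(list(skip_first_islice([])))  # []
```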

 

Posted by Uli Köhler in Python

How to fix NumPy timedelta64 TypeError: Invalid datetime unit “min” in metadata

Problem:

You want to construct a NumPy timedelta64 from a value in minutes using

np.timedelta64(1, 'min')

but you see an error message like

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    delta = np.timedelta64(1, 'min')
TypeError: Invalid datetime unit "min" in metadata

Solution:

numpy uses m as specifier for minutes, not min! Change your code to

np.timedelta64(1, 'm')

 

Posted by Uli Köhler in Python

Split pandas DataFrame every time a Series is True

In our previous post we explored how to Split pandas DataFrame every time a column is True.

This slightly modified function also works if the given Series is not a column in the DataFrame:

def split_dataframe_by_series(df, series):
    """
    Split a DataFrame where the given series is True. Yields a number of dataframes
    """
    previous_index = df.index[0]

    for split_point in df[series].index:
        yield df[previous_index:split_point]
        previous_index = split_point
    # Yield remainder of dataset
    try:
        yield df[split_point:]
    except UnboundLocalError:
        pass # There is no split point => Ignore

Full example

We’ll use the ZeroCrossing column we built in our previous post on How to detect value change in pandas string column/series which itself builds on our post on How to create pandas time series DataFrame example dataset. Based on that example we add the modified utility function shown above:

import pandas as pd

# Load pre-built time series example dataset
df = pd.read_csv("https://techoverflow.net/datasets/timeseries-example.csv", parse_dates=["Timestamp"])
df.set_index("Timestamp", inplace=True)

# Create a new column containing "Positive" or "Negative"
df["SinePositive"] = (df["Sine"] >= 0).map({True: "Positive", False: "Negative"})
# Create "change" column (boolean)
df["ZeroCrossing"] = df["SinePositive"].shift() != df["SinePositive"]
# Set first entry to False
df.iloc[0, df.columns.get_loc("ZeroCrossing")] = False

def split_dataframe_by_series(df, series):
    """
    Split a DataFrame where the given series is True. Yields a number of dataframes
    """
    previous_index = df.index[0]

    for split_point in df[series].index:
        yield df[previous_index:split_point]
        previous_index = split_point
    # Yield remainder of dataset
    try:
        yield df[split_point:]
    except UnboundLocalError:
        pass # There is no split point => Ignore

# Print result
split_frames = list(split_dataframe_by_series(df, df["ZeroCrossing"]))
print(f"Split DataFrame into {len(split_frames)} separate frames by zero-crossing")

Note that converting the result of split_dataframe_by_series() into a list might not be necessary depending on your application. If possible, I recommend directly iterating the data frames using a for loop, e.g.:

for df_section in split_dataframe_by_series(df, df["ZeroCrossing"]):
    pass # TODO: Your code goes here!

 

Posted by Uli Köhler in pandas, Python

Split pandas DataFrame every time a column is True

TL;DR

If the Series you want to use to split is a column in the DataFrame, continue reading this post. Else, read Split pandas DataFrame every time a Series is True.

Use this utility function:

def split_dataframe_by_column(df, column):
    """
    Split a DataFrame where a column is True. Yields a number of dataframes
    """
    previous_index = df.index[0]

    for split_point in df[df[column]].index:
        yield df[previous_index:split_point]
        previous_index = split_point
    # Yield remainder of dataset
    try:
        yield df[split_point:]
    except UnboundLocalError:
        pass # There is no split point => Ignore

# Usage example:
list(split_dataframe_by_column(df, "ZeroCrossing"))

Note that one or more of those dataframes might be empty.

Full example:

We’ll use the ZeroCrossing column we built in our previous post on How to detect value change in pandas string column/series which itself builds on our post on How to create pandas time series DataFrame example dataset. Based on that example we add the utility function shown above:

import pandas as pd

# Load pre-built time series example dataset
df = pd.read_csv("https://techoverflow.net/datasets/timeseries-example.csv", parse_dates=["Timestamp"])
df.set_index("Timestamp", inplace=True)

# Create a new column containing "Positive" or "Negative"
df["SinePositive"] = (df["Sine"] >= 0).map({True: "Positive", False: "Negative"})
# Create "change" column (boolean)
df["ZeroCrossing"] = df["SinePositive"].shift() != df["SinePositive"]
# Set first entry to False
df.iloc[0, df.columns.get_loc("ZeroCrossing")] = False

def split_dataframe_by_column(df, column):
    """Split a DataFrame where a column is True. Yields a number of dataframes"""
    previous_index = df.index[0]

    for split_point in df[df[column]].index:
        yield df[previous_index:split_point]
        previous_index = split_point
    # Yield remainder of dataset
    try:
        yield df[split_point:]
    except UnboundLocalError:
        pass # There is no split point => Ignore

# Print result
split_frames = list(split_dataframe_by_column(df, "ZeroCrossing"))
print(f"Split DataFrame into {len(split_frames)} separate frames by zero-crossing")
# This prints "Split DataFrame into 20 separate frames by zero-crossing"

 

Posted by Uli Köhler in pandas, Python