Function vs Sub(ByRef)

Question

About

This question is not about when to use a Function or a Sub, or the difference between ByRef and ByVal (although some insights will be unavoidable).
It is about scenarios which are 'commonly' solved with a Function, but can optionally be solved with a Sub using ByRef in the sense of 'modifying'.

The Code

Consider the following function:

' Returns the worksheet (object) with a specified name in a specified workbook (object).
Function getWsF(wb As Workbook, _
                ByVal wsName As String) _
         As Worksheet
    ' 'getWsF' is 'Nothing' by default.
    ' Try to define worksheet.
    On Error Resume Next
    Set getWsF = wb.Worksheets(wsName)
End Function

You can utilize it like the following:

' Writes the name of a specified worksheet, if it exists, to the `Immediate` window...
Sub testFunction()
    
    Const wsName As String = "Sheet1"
    Dim wb As Workbook
    Set wb = ThisWorkbook ' The workbook containing this code.
    
    ' Define worksheet.
    Dim ws As Worksheet
    Set ws = getWsF(wb, wsName)
    
    ' Test if worksheet exists.
    If Not ws Is Nothing Then
        Debug.Print "The worksheet name is '" & ws.Name & "'."
    Else
        Debug.Print "Worksheet '" & wsName & "' doesn't exist in workbook '" _
                  & wb.Name & "'."
    End If

End Sub

But you can also write each of the procedures in the following way:

' Although 'ByRef' is not necessary, I'm using it to indicate that whatever
' its variable is referring to in another procedure (in this case
' a worksheet object), is going to be modified (possibly written to
' for other datatypes).
Sub getWsS(ByRef Sheet As Worksheet, _
           wb As Workbook, _
           ByVal wsName As String)
    ' 'Sheet' could be 'Nothing' or an existing worksheet. You could omit
    ' the following line if you plan to use the procedure immediately
    ' after declaring the worksheet object, but I would consider it
    ' as too risky. Therefore:
    ' 'Reinitialize' worksheet variable.
    Set Sheet = Nothing
    ' Try to define worksheet.
    On Error Resume Next
    Set Sheet = wb.Worksheets(wsName)
End Sub

' Writes the name of a specified worksheet, if it exists, to the `Immediate` window...
Sub testSub()
    
    Const wsName As String = "Sheet1"
    Dim wb As Workbook
    Set wb = ThisWorkbook ' The workbook containing this code.
    
    ' Define worksheet.
    Dim ws As Worksheet
    getWsS ws, wb, wsName
    
    ' Test if worksheet exists.
    If Not ws Is Nothing Then
        Debug.Print "The worksheet name is '" & ws.Name & "'."
    Else
        Debug.Print "Worksheet '" & wsName & "' doesn't exist in workbook '" _
                  & wb.Name & "'."
    End If

End Sub

Side by Side

Procedure

Function getWsF(wb As Workbook, _       Sub getWsS(ByRef Sheet As Worksheet, _
                wsName As String) _                wb As Workbook, _
         As Worksheet                              wsName As String)
                                          Set Sheet = Nothing
  On Error Resume Next                    On Error Resume Next
  Set getWsF = wb.Worksheets(wsName)      Set Sheet = wb.Worksheets(wsName)
End Function                            End Sub

Usage (relevant)

  ' Define worksheet.                     ' Define worksheet.
  Dim ws As Worksheet                     Dim ws As Worksheet
  Set ws = getWsF(wb, wsName)             getWsS ws, wb, wsName

The Question(s)

Is the second solution viable?
I'm looking for a proper description of what each of the relevant two procedures do and some insights in terms of common practice, readability, efficiency, pitfalls ...

Object parameters are passed ByRef even if you specify ByVal I do believe. VBA isn't going to make a deep copy unless you code up a way to do it. Since ByRef is the default anyway, you can avoid some keypresses. And since you are specifying Types, I don't think you have to worry about passing Default members instead of the whole object (reference). But readability is not so good. If you haven't looked at your code for months, or someone else looks at it, it could be a real pitfall. — klausnrooster, Sep 13 '20 at 13:47
I see the keypresses point was unneccessary. The `On Error Resume Next` to bypass the very real risk of "Subscript Out Of Range" errors would bite you. But that's just in your little example. You could probably "do it right". If you are the only one that has to maintain your projects, using Subs this way might not be a readability issue. I prefer the obvious assignment landmarks of "Set" and "=" in the caller. — klausnrooster, Sep 13 '20 at 14:37
I think if you're only dealing with returning a single item then a Function would be the "expected" way to go - certainly if reading someone else's code (or my own six months later) that's what I'd expect to see. The only time I use the Sub/ByRef approach is if I need to set the value of several items from a single call, and I don't want to create a class/struct wrapper to return those from a function. — Tim Williams, Sep 13 '20 at 17:02
...also if I do use the ByRef method I always tag the parameters explicitly with ByRef, even though that's the default, as a reminder to myself that those are "in/out" parameters. — Tim Williams, Sep 13 '20 at 17:31
@klausnrooster Object parameters are passed ByRef even if you specify ByVal I do believe No, they're not. When you pass an object, you are passing a pointer to that object. When passed ByVal the object itself can be modified in the Function/Sub (probably why you thought it was actually `ByRef`) but the pointer itself, if modified, is not returned to the caller. In the OP's example `Sub getWsS(ByVal Sheet As Worksheet, ...` would fail as the callers `ws` variable would not get updated. — chris neilsen, Sep 14 '20 at 03:59
I would put the `ByRef` argument(s) last in the list of arguments. — John Alexiou, Nov 06 '20 at 22:53

Cristian Buse · Accepted Answer · 2020-11-09T14:04:18.987

In your case, I would use the Function approach and here are a few reasons:

I can use the result of the Function without storing the returning variable:

With getWsF(ThisWorkbook, "Sheet1")
    '...
End With

Obviously, I would need to be sure it never returns Nothing or have some error handling in place.

or

DoSomething getWsF(ThisWorkbook, "Sheet1")

where DoSomething is a method expecting a Worksheet/Nothing

As @TimWilliams mentioned in the comments, if you don't expect multiple return values then this is the "expected" way to do it. A well-established convention is that if the method has no return value it should be a Sub. If it has a single return value then it should be a Function. If it returns multiple values then it should also be a Function and:
- you either use a Class or Type to pack them as one result
- or, the Function returns a primary value as the result and the rest of the return values ByRef (see @UnhandledException's answer for an example).
If you ever need to call the method with Application.Run then Function is safe. Using a Sub often leads to Automation Errors or code simply stops running after the method is executed. It doesn't matter if you need to use the result of the Function or not, don't call a Sub with Application.Run if you don't want nasty errors. Of course, to avoid issues with Application.Run, you could have a Function that doesn't get assigned a return value and still return the Worksheet ByRef but this would be too confusing for the reader.

Edit #1

Forgot to mention that the Application.Run automation errors are happening when calling methods from a different document (for Excel - different Workbook)

Edit #2

In this section I will try to address the proper description side of your question, without doing a begginer and an advanced explanation but a combined one.

Difference between a Sub and a Function

A Sub is just a function that does not return a value after the function executes. In lots of languages, such a function is called a Void Function.

The implications is that a Sub is just a stand-alone statement. It cannot be called from inside an expression. You can only call it with one of:

MySub [argsList]
Call MySub([argsList])

On the other hand, a Function can be used inside statements like:

arguments to other methods e.g. DoSomething MyFunction(...); Debug.Print MyFunction(...)
assignment e.g. x = MyFunction(...)
With blocks e.g. With MyFunction(...)
method chaining e.g. MyFunction(...).DoSomething

The convention mentioned above:

A well-established convention is that if the method has no return value it should be a Sub. If it has a single return value then it should be a Function

becomes quite clear when we understand that a Sub does something and a Function returns a single value, by definition.

Similarity between a Sub and a Function

Both value-returning functions (Function in VBA) and void functions (Sub in VBA) are receiving values as parameters. In VBA, it is possible to return results via ByRef parameters. Not all languages support ByRef parameters (e.g. Java - except modifying members of Objects for example).

Note that porting code from a platform that supports ByRef to another one that does not, can be quite time-consuming if the ByRef approach is abused in the source platform.

Difference between ByVal and ByRef parameters

Passing by value (ByVal):

a new memory space is allocated for a new variable which will be of local scope to the method being called
the contents of the original variable are copied in the newly allocated space of the new variable (for Objects the address of the interface Virtual Table is copied instead)
contents of the original variable are NOT changed regardless of what the method does
it is much safer because the programer does not need to keep in mind/care about other parts of the program (specifically the calling method)

Passing by reference (ByRef):

a new variable is created but no new memory space is allocated. Instead the new variable points to the memory space occupied by the original variable being passed from the calling method. Note that for Objects, the original variable is passed entirely (no new variable is created) unless the interface is different (passing a Collection as an Object type parameter - Object stands for IDispatch) but this is a discussion outside of the scope of this answer. Also note that if the parameter is declared as Variant, there are more complex operations happening to facilitate the redirection
the contents of the original variable can now be changed remotely because both the original variable and the newly created one point to the same memory space
it is considered more efficient because no new memory is allocated but this comes with the downside of increasing complexity

Comparison of the presented methods

Now that we have some understanding of the differences, we can look at both the presented methods. Let's start with the Sub:

Sub getWsS(ByRef Sheet As Worksheet, wb As Workbook, ByVal wsName As String)
    Set Sheet = Nothing
    On Error Resume Next
    Set Sheet = wb.Worksheets(wsName)
End Sub

First of all, there should be an On Error GoTo 0 statement before the End Sub because otherwise the Error 9 is propagated up the calling chain (if sheet not found) and can affect logic inside other methods, long after the getWsS method has returned.

The method name starts with the verb "get" which implies that this method returns something but the method declaration is a Sub which is, by definition, more like a Do Something than a Return Something. Readability is certainly affected.

There is a need for an extra ByRef parameter to return the single piece of result. Implications:

it affects readability
it requires a declared variable inside the calling method
the result cannot he chained/used in other expressions within the calling method
it requries the extra line Set Sheet = Nothing to make sure the original variable does not retain previous contents

Now, let's look at the Function approach:

Function getWsF(wb As Workbook, ByVal wsName As String) As Worksheet
    On Error Resume Next
    Set getWsF = wb.Worksheets(wsName)
End Function

Same as before, there should be an On Error GoTo 0 statement before the End Function because otherwise the Error 9 is propagated up the calling chain. Also, the Workbook can be passed ByVal as best practice.

Obvious differences:

the name getSomething is perfect for a function that returns Something. Readability is far better than the Sub couterpart
the reader/maintainer of the code instantly knows that the function returns a Worksheet just by looking at the return type (as opposed to looking through a list of ByRef parameters and figuring out which one is the return variable)
the result can be chained/used in expressions
no extra lines of code are needed, the default returning value is already Nothing
the most widely accepted convention is used

I've used CTimer and it seems like on my x64 machine, the Sub approach runs faster with about 20ms when running the methods for a million times. Are these minor efficiency gains worth the loss in readability and flexibility of use? That is something that only the maintainer of the code base can decide.

Thanks. There is this *proper description* part in my question. Could you have a crack at it. Ideally I would like to have two descriptions per procedure: one for the beginner and one for a more advanced user. I know what both procedures do, but I cannot put it into words (especially the Sub). In your answer there is stuff I wasn't aware of and some I do not understand yet, but will investigate. — VBasic2008, Nov 06 '20 at 22:01
@VBasic2008 Hi, I've added an **Edit #2** section to address your comment — Cristian Buse, Nov 09 '20 at 14:04
Thank you very much, this is quite extensive and educational for me. Now, you mentioned *otherwise the Error 9 is propagated up the calling chain*. My understanding was that either a worksheet object or `Nothing` will be returned, `End Function` and that's the end of the story. So you're saying that it's not so simple. Could you maybe address this in the comments in short and provide a link where I could try to understand the implications of not using `On Error Goto 0` at the end? — VBasic2008, Nov 09 '20 at 14:27
@VBasic2008 ```On Error Resume Next``` forces the error to be ignored. The function will return ```Nothing``` as expected but you can quickly check the error code by using something like ```Debug.Print Err.Number``` inside the calling method. In general if you use ```On Error Resume Next``` you should always have a corresponding ```On Error Go To 0```. Using ```On Error GoTo ErrHandler``` is a different story because the Err Object (global BTW) gets reset when the method exists or you use ```Resume Next```. — Cristian Buse, Nov 09 '20 at 14:32
@VBasic2008 Try reading this [article](http://www.cpearson.com/excel/errorhandling.htm) from the reputable Chip Pearson — Cristian Buse, Nov 09 '20 at 14:36
Sorry for bothering, but this is ground breaking for me. In the error handler, if I use e.g. `Resume CleanExit`, the error handler stays 'active' which I might not want. If I use `Goto CleanExit` instead, will this qualify for the `On Error Resume Next 'bad' case without On Error Goto 0` or is it OK? — VBasic2008, Nov 09 '20 at 14:40
@VBasic2008 Let me try to explain this differently. The Err Object is a global object and it always stores information about the last encountered error. If you use ```On Error Resume Next``` and an error occurs, it is ignored but the mentioned Err Object retains the error info even after exiting the method. This can be a real issue if in other parts of the code there is logic like ```If Err.Number = 0 Then```. The ```On Error Go To 0``` is the default mode and the mode gets reset by exiting the method. But, NOT the Err Object. So you can use either ```On Error Go To 0``` or ```Err.Clear``` — Cristian Buse, Nov 09 '20 at 14:50
Thanks for the link. I read it at least twice before, but obviously not with enough attention since I almost immediately found this part `Error Handling With Multiple Procedures` which I didn't know of. Even worse, I encountered it a few times in my code, but again, didn't pay much attention. Once again, thank you very much. — VBasic2008, Nov 09 '20 at 14:50
@VBasic2008 Continuation of the previous comment: ...So you can use either ```On Error Go To 0``` or ```Err.Clear``` to make sure you also reset the Global Err Object. I prefer the first because the second kind of implies that you are sure there is an error and you reset it while the first is more like "Go to default error mode". — Cristian Buse, Nov 09 '20 at 14:52
@VBasic2008 If you use ```On Error Resume Next``` and then ```Goto CleanExit``` this does not reset the global Err Object. The local Error handler is always reset when the method is exited but not the Err Object. But, if you use ```On Error GoTo CleanExit``` then the Err Object also gets reset when exiting the method. When in doubt, just write some test procedures and test the assumptions. It will help you better — Cristian Buse, Nov 09 '20 at 14:57

score 3 · Answer 2 · edited Sep 14 '20 at 13:24

To answer your question directly:

Q: Is it viable?
A: Yes, it will compile and carry out the functionality that you're expecting.

The grey area comes about when you ask should you do this.

There's definitely nothing stopping you (assuming you aren't subject to some company coding standards or anything). Typically however, functions are used to take in parameters, perform some kind of logic and return a value or object at the end of that logic.

Functions are typically non-destructive and don't change the values or properties of the input parameters. This becomes especially important for code readability and maintenance because other developers (and even yourself a few months from now) will read the code expecting functions to behave in a certain way.

Sub routines on the other hand are not expected to return anything, and so they are used to run concise, related sections of code that carry out some kind of logic relevant to the application. Going back to the point of readability and maintenance, it's reasonable to assume that objects and properties will change inside of a sub routine and so again this makes the developer's life a little easier.

Ultimately there's no hard and fast rules - but there are years of experience and best practice which are often good advice to take on in these scenarios :)

score 0 · Answer 3 · answered Sep 14 '20 at 13:47

0

A good example for using both, a function and a ByRef parameter is a 'Try' function:

Public Function TryGetMyValue(ByRef outMyValue As Long) As Boolean
    On Error Goto Catch

    outMyValue = GetMyValue() 'Does anything to retrieve the value and could raise an error...

    TryGetMyValue = True

Done:
    Exit Function

Catch:
    Resume Done
End Function

It could be used like that:

Public Sub TestTryGetMyValue()
    Dim myValue As Long
    If Not TryGetMyValue(myValue) Then Exit Sub

    Debug.? "MyValue is: " & myValue
End Sub

answered Sep 14 '20 at 13:47

AHeyne

3,377
2
11
16

Your answer is kind of out of my league, but I'm willing to understand. Does the line `outMyValue = GetMyValue()` have to be changed to `outMyValue = TryGetMyValue()`? Could you share a more 'practical' example, including a scenario where the `Catch` is involved? – VBasic2008 Nov 06 '20 at 22:35
`GetMyValue` is just a sample. It could be a function you programmed/provide to set the ref-parameter `outMyValue`. If there happens an error, it will be catched by `TryGetMyValue` and so `TryGetMyValue` returns false, because `TryGetMyValue = True` won't be executed. If no error occurs, `TryGetMyValue = True` will be executed and `TryGetMyValue` returns true. Your calling code just checks this returned value (`True`/`False`) and in case it is `True`, you know that the value could be retrieved correctly and now is in your variable `myValue` of you calling procedure. – AHeyne Nov 07 '20 at 08:03

Function vs Sub(ByRef)

3 Answers3

Linked